Data Science Blog: Understand. Implement. Succeed.

In this post, I want to share how Python can be used to automate the documentation of machine-learning (ML) experiments using AsciiDoc. The search for the best-performing ML model is an empirical process, which involves fitting models with differing parameters and evaluating their predictive performance. Only after a multitude (e.g. hundreds or thousands) of models have been evaluated is it possible to confidently proclaim that a suitable model has been identified. The major challenge of running vast numbers of experiments is that they are time- and compute-intensive because results usually have to be delivered within a certain time frame (e.

Radar visualizations for technological choices have been pioneered by ThoughtWorks. In the meantime, many organizations have created their own tech radars to map out which technologies should be considered for use by members of the organization. The German online fashion retailer Zalando has even made the source code of their tech radar publicly available. Since technological decisions for data science and AI projects are distinct from conventional applications, I decided to adapt Zalando’s tech radar.

Protocol buffers (Protobuf) are a language-agnostic data serialization format developed by Google. Protobuf is great for the following reasons: Low data volume: Protobuf makes use of a binary format, which is more compact than other formats such as JSON. Persistence: Protobuf serialization is backward-compatible. This means that you can always restore previous data, even if the interfaces have changed in the meantime. Design by contract: Protobuf requires the specification of messages using explicit identifiers and types.
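To make the "low data volume" point a bit more tangible, here is a minimal sketch that hand-encodes a tiny message following the Protobuf wire format and compares its size with the JSON equivalent. This is not how you would use Protobuf in practice (you would normally generate classes with `protoc` and use the official library). The helper functions and the message layout (field 1 as an integer id, field 2 as a string name) are assumptions made purely for illustration.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # set the continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    # The field key combines the field number with wire type 0 (varint).
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    # Wire type 2 (length-delimited): key, then byte length, then the payload.
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


# Hypothetical message: field 1 is an integer id, field 2 is a string name.
binary = encode_int_field(1, 150) + encode_string_field(2, "testing")
as_json = json.dumps({"id": 150, "name": "testing"}).encode("utf-8")

print(binary.hex())
print(len(binary), "bytes in wire format vs.", len(as_json), "bytes as JSON")
```

The saving comes from the fact that the binary message carries only small numeric field identifiers, whereas JSON repeats the field names as text in every message.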

Are you a researcher in data science? Are you in desperate need of GPU resources for your next project? Then you should know that a GPU server may be just around the corner. HOSTKEY is currently hosting a competition where you can win a grant for free GPU resources. The competition is open to all researchers in the data science sphere. Application Criteria for the Grant Program If you want to apply, you have to send the following information:

Companies usually have firewalls in place, which ensure that the internal network is protected. To access the outside world, all traffic must be routed through a proxy. When you are using the standard operating system (typically Windows), you are automatically authenticated with this proxy. However, when you are using a non-standard operating system (e.g. through a virtual machine running Linux), you are not automatically authenticated with the company’s proxy. The sad result: you won’t be able to access the internet out of the box.
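As a rough sketch of what explicit proxy configuration can look like, the following Python snippet sets the conventional `http_proxy` and `https_proxy` environment variables and builds a `urllib` opener from them. The proxy host, port, and credentials are hypothetical placeholders, and the sketch assumes a proxy that accepts basic authentication. Many corporate proxies require NTLM or Kerberos instead, which this does not cover.

```python
import os
import urllib.request

# Hypothetical placeholder: replace host, port, and credentials with your own.
PROXY_URL = "http://username:password@proxy.example.com:8080"

# Many command-line tools and libraries honor these environment variables.
for variable in ("http_proxy", "https_proxy"):
    os.environ[variable] = PROXY_URL

# urllib reads the proxy settings from the environment.
proxies = urllib.request.getproxies()
print(proxies)

# Build an opener that routes requests through the configured proxy.
opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
```

Calling `opener.open(...)` with a URL would then send the request through the proxy. I left that call out because it only works inside a network where the proxy actually exists.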

Flask is a lightweight Python web development framework that is becoming more and more popular, as you can see from this comparison against Django.

AWS (Amazon Web Services) certifications are among the most lucrative certifications in the IT sector. This is due to the growing demand for professionals with cloud expertise, as more and more companies are adopting cloud technology. Furthermore, AWS upholds high quality standards when it comes to certification. So, while certification can be challenging, there is a lot to learn along the way. I only recently had my first exposure to cloud computing when I took on a DevOps role in industry in 2019.

The Cambridge Dictionary defines plagiarism as ‘the process or practice of using another person’s ideas or work and pretending that it is your own’. In recent years, several famous Germans have lost their PhD titles for plagiarizing their doctoral theses. In Germany, VroniPlag is the largest open community that analyzes scientific work with respect to plagiarism. Most notably, in 2011, Guttenplag (a specific group of plagiarism hunters) published a detailed analysis of the doctoral thesis by Karl-Theodor zu Guttenberg, the German defense minister at that time.

Let’s say you are currently adding new arguments to an installation script for your software. After some work, your commit history may look different than you would like.

When I started working in the IT sector, I was impressed by the large number of different roles that exist, and it took me quite a bit of time to understand their individual responsibilities. That is why I thought it would be nice to share my understanding of the most common roles you will encounter in IT projects. You should definitely read this post if you are thinking about applying for a position in the information technology sector but are unsure which one is the right fit for you, or if you’re already working in IT and want to improve your understanding of other roles.

Posts

Automating the Documentation of ML Experiments using Python and AsciiDoc

Introducing the Data Science Tech Radar

The Essential Protobuf Guide for Python

Boost your Data Science Research with a Free GPU Server

How to Bypass Corporate Firewalls?

REST API Development with Flask

Becoming an AWS Certified Cloud Solutions Architect Associate

Plagiarism in Academia

Rewriting History with Git

The Roles You will Encounter in Most IT Projects