How machine learning can improve software development itself


One topic will undoubtedly be among the leading trends of 2017: machine learning.

The idea behind it is not new in itself. But our current age of digitalization has given machine learning a whole new, wide-ranging impact: large volumes of data and a high level of automation increase the need to connect and evaluate these data intelligently.

Typical areas of application include:

  • chat bots used in sales, marketing and service
  • driver assistance systems and autonomous driving in the automotive industry
  • smart manufacturing and predictive maintenance in the manufacturing sector

High expectations of machine learning

The activities of many global IT corporations show that machine learning is high on their agendas. Google, IBM and Microsoft have all made machine learning an important component of their business strategies. In addition, the tech giants have been recruiting entire expert teams and acquiring machine learning and AI startups.

While IT, automotive, telecommunications and media are among the pioneers of this development, more traditional industries such as the chemicals sector, logistics/transportation and pharmaceuticals are already awaiting their turn.

This makes me wonder whether machine learning can offer genuine value to the field of software development itself.

The increasing complexity of software systems

During the past 10 to 15 years, software solutions have become considerably more complex. Many contemporary systems contain thousands of individual components that are connected to each other via APIs; even simple tasks frequently “have to” be handled across a whole range of different interfaces. This does not exactly simplify the process of software development.

System complexity has been increasing since 1960

Further information: Million lines of code (information is beautiful)

But this is exactly where machine learning can make things easier:

  • Automatic error detection and troubleshooting: State-of-the-art tools already help you quickly identify standard error patterns in the code. Machine learning goes a step further: the technology detects API usage constraints not only for the standard library but also for all other libraries, completely automatically. In addition, the system suggests suitable fixes for specific problems.
  • Intelligent programming assistants: Developers spend up to 50% of their time reading documentation and source code. Programming assistants based on machine learning can cut this time by more than half: they detect which task the developer is working on and offer context-related support in real time, such as suitable code examples, usage statistics and recommendations.
  • Clean code: Many companies rely on best practices in their software development process, e.g. for naming variables and structuring their source code. Quality assurance, however, is still largely carried out manually, which means high effort at high cost. Machine learning can automate this process by detecting and documenting best coding practices. The technology continuously checks whether the naming conventions and structure in the company-wide code repository meet quality requirements.
  • Automatic refactoring and migration: Large-scale refactoring is a frequent necessity, especially when upgrading a library from version 1.0 to 2.0, for instance. Again, machine learning can make things considerably easier: the technology draws on a vast amount of sample source code to learn typical migration patterns, which it then applies to the existing code. The effects on the entire code base are visible in advance, which makes migration processes significantly more cost-effective.
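To make the "clean code" idea above a little more concrete, here is a minimal sketch in Python. It is not a machine learning model and not taken from any of the tools discussed in this article: it simply "learns" the dominant variable naming style from the code itself by counting, then reports names that deviate from it. The function names (`style_of`, `assigned_names`, `find_outliers`) and the two regular expressions are my own assumptions for illustration.

```python
import ast
import re
from collections import Counter

# Assumption: only two styles are distinguished; single-word names are neutral.
SNAKE = re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)+$")
CAMEL = re.compile(r"^[a-z][a-z0-9]*(?:[A-Z][a-z0-9]*)+$")


def style_of(name):
    """Classify an identifier as 'snake', 'camel' or 'neutral'."""
    if SNAKE.match(name):
        return "snake"
    if CAMEL.match(name):
        return "camel"
    return "neutral"


def assigned_names(source):
    """Return (name, line) for every variable the source assigns to."""
    tree = ast.parse(source)
    return [
        (node.id, node.lineno)
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    ]


def find_outliers(source):
    """Infer the dominant naming style, then list names that deviate from it."""
    names = assigned_names(source)
    counts = Counter(style_of(name) for name, _ in names)
    counts.pop("neutral", None)  # neutral names carry no style information
    if not counts:
        return None, []
    dominant = counts.most_common(1)[0][0]
    outliers = [
        (name, line)
        for name, line in names
        if style_of(name) not in ("neutral", dominant)
    ]
    return dominant, outliers


sample = """
user_name = "ada"
order_total = 10
itemCount = 3
line_items = []
x = 1
"""

dominant, outliers = find_outliers(sample)
print(dominant, outliers)
```

A real system would learn from an entire company-wide repository and from far richer features than the spelling of a name, but the principle is the same: the convention is derived from existing code rather than configured by hand.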

Software companies are already working on innovative solutions

A quick market screening has shown that several teams and companies are already looking into the use of machine learning specifically for software development:

1. Kite

Kite is an intelligent programming assistant for Python. The software indexes code samples that are freely available on the internet and makes them available to other developers in their own (text) editors. This can take the form of an intelligent code completion system, for example, in which relevant code suggestions are ranked by their frequency of occurrence and displayed accordingly. The software also provides context-based code snippets. In addition, it collects and supplies documentation from different online sources.
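The frequency-based ranking described above can be sketched in a few lines of Python. This is only an illustration of the general idea under my own assumptions, not Kite's actual implementation: `build_index` and `complete` are hypothetical names, and the three-snippet corpus stands in for the large body of public code such a tool would index.

```python
import ast
from collections import Counter, defaultdict


def build_index(snippets):
    """Count how often each attribute is accessed on each plain name."""
    index = defaultdict(Counter)
    for snippet in snippets:
        for node in ast.walk(ast.parse(snippet)):
            # Matches `os.path` but not the outer `.join` of `os.path.join`.
            if isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
                index[node.value.id][node.attr] += 1
    return index


def complete(index, receiver, prefix="", limit=3):
    """Suggest attributes of `receiver`, most frequently used first."""
    ranked = index.get(receiver, Counter()).most_common()
    return [name for name, _ in ranked if name.startswith(prefix)][:limit]


# Assumption: a tiny stand-in corpus of code samples.
corpus = [
    "import os\nos.path.join('a', 'b')",
    "import os\nos.path.exists('a')\nos.getcwd()",
    "import os\nos.getcwd()\nos.listdir('.')\nos.path.join('x', 'y')",
]

index = build_index(corpus)
print(complete(index, "os"))
print(complete(index, "os", prefix="l"))
```

Here `path` is suggested first because it occurs most often after `os` in the corpus, followed by `getcwd` and `listdir`.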

Besides the (free) basic version, Kite is available as a Pro version and an (on-premise) Enterprise version.

Status: publicly available
Since when: April 2016
License: proprietary, free version available
Website: kite.com
Twitter: @kiteHQ
Headquarters: San Francisco, USA

2. Ctrlflow Insights

Ctrlflow Insights by Codetrails is a smart software suite designed especially for Java developers. This complete solution consists of three components: an automatic error detection tool based on machine learning and two intelligent programming assistants (context-based code completion and a comprehensive code repository search engine with detailed querying and aggregation options).

An open-source integration into the Eclipse development environment is available for Ctrlflow Insights. Since 2012, it has been a standard component of the Eclipse IDE for Java developers. Enterprise customers have access to an on-premise solution.

Status: publicly available
Since when: January 2012
License: proprietary, trial version available
Website: https://ctrlflow.com/insights/enterprise/
Twitter: @ctrlflow
Headquarters: Darmstadt, Germany

3. QuantifiedCode.com

QuantifiedCode.com is a platform for automatic code reviews, designed specifically for Python code. The crowdsourcing component of QuantifiedCode.com is particularly interesting: it allows developers to write their own code checkers and share them with the Python community. Users can register their own GitHub repository on QuantifiedCode.com and have it checked entirely automatically and free of charge. This check is based on all code checkers that the community has already shared.
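To give an idea of what a small, shareable code checker might look like, here is a sketch built on Python's standard `ast` module. It does not use QuantifiedCode's own checker format, which I have not reproduced here; `check_mutable_defaults`, `CHECKERS` and `run_checkers` are hypothetical names chosen for this example.

```python
import ast


def check_mutable_defaults(source):
    """Return (line, function name) for each mutable default argument."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # kw_defaults may contain None entries; isinstance skips them.
            for default in node.args.defaults + node.args.kw_defaults:
                if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                    findings.append((default.lineno, node.name))
    return findings


# Assumption: "sharing" a checker simply means adding it to this list.
CHECKERS = [check_mutable_defaults]


def run_checkers(source, checkers=CHECKERS):
    """Run every registered checker and collect its findings by name."""
    return {checker.__name__: checker(source) for checker in checkers}


code = (
    "def add(item, bucket=[]):\n"
    "    bucket.append(item)\n"
    "    return bucket\n"
    "\n"
    "def ok(item, bucket=None):\n"
    "    return bucket\n"
)

print(run_checkers(code))
```

The first function is flagged because its default list is shared between calls, a classic Python pitfall; the second one passes.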

Status: publicly available
Since when: April 2014
License: proprietary
Website: https://www.quantifiedcode.com
Twitter: @quantifiedCode
Headquarters: Munich, Germany

4. Codota

“Your AI Pair Programmer” – such is Codota’s tagline.

Codota offers functions similar to those of Kite, but it was developed for Java instead of Python. The solution analyses public code repositories on platforms such as GitHub or Bitbucket and extracts code samples from discussions on Stack Overflow. Code samples that match the development context are displayed in a separate application window, where users can copy them directly into their own source code with just a few clicks.

Status: publicly available
Since when: 2013
License: proprietary, free version available
Website: https://www.codota.com
Twitter: @Codota_
Headquarters: Tel Aviv, Israel

5. Acellere Gamma

Acellere Gamma describes itself as an “AI-powered Software Analytics Platform”. The solution comes with a management dashboard that allows users to visualize different code metrics, such as the number of lines of code or coupling between objects. In addition, the software offers support for typical design challenges, helping you to avoid “God Classes” and “Feature Envy”, for example. The system automatically calculates and suggests possible solutions.
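As a rough illustration of such metrics, the following Python sketch counts methods and lines per class and flags classes with many methods as possible "God Classes". It is a simple threshold heuristic of my own, not Acellere's method; the names `class_metrics` and `god_class_candidates` and the threshold of three methods are assumptions chosen to keep the example small.

```python
import ast


def class_metrics(source):
    """Return {class name: (method count, lines spanned)} for each class."""
    metrics = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            methods = [
                child
                for child in node.body
                if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
            # end_lineno is available from Python 3.8 onwards.
            lines = node.end_lineno - node.lineno + 1
            metrics[node.name] = (len(methods), lines)
    return metrics


def god_class_candidates(source, max_methods=3):
    """List classes with more methods than the (arbitrary) threshold."""
    return [
        name
        for name, (method_count, _) in class_metrics(source).items()
        if method_count > max_methods
    ]


sample = """
class Small:
    def run(self):
        pass

class DoesEverything:
    def load(self): pass
    def parse(self): pass
    def render(self): pass
    def save(self): pass
"""

print(class_metrics(sample))
print(god_class_candidates(sample))
```

Metrics such as coupling between objects require resolving references across classes and files, which goes beyond this sketch.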

Unfortunately, I could not find out how exactly machine learning is used in this context.

Status: publicly available (Q3/2017)
Since when: 2017
License: proprietary
Website: http://www.acellere.com
Twitter: @acellere_
Headquarters: Frankfurt, Germany

6. Source{d}

The Source{d} vision: a future in which artificial neural networks allow software programs to write their own code. To make their vision come true, Source{d} is developing a universal code representation, which models all relations between the individual code elements, code rules, etc. of an API. On this basis, it creates an intelligent programming assistant that gradually implements increasingly large parts of the software system entirely autonomously.

An exciting vision. It remains to be seen, however, how quickly Source{d} can really turn it into reality.

Status: not publicly available
Since when: March 2015
License: proprietary, some open-source libraries
Website: http://sourced.tech
Twitter: @srcd_
Headquarters: Madrid, Spain

Notes on the market screening:
This overview is an excerpt and does not claim to be exhaustive.

Machine learning will shape the future

Today, we can already say that machine learning most certainly has a bright future ahead. How far it goes will depend on the added value the technology provides to companies and their value chains, and on how well this value translates into economic figures.

Based on my personal insights, I believe that machine learning can advance the field of software development itself considerably. There is clear added value.

I think that software developers and machine learning experts need to work together even more closely. This would allow each side to understand the challenges facing the other and, together, to develop truly outstanding, marketable solutions.

I would be grateful for any links, recommendations and opinions on this topic and look forward to an inspiring exchange.

Cover image source: nd3000/Shutterstock.com