Data science cases
Data science is the term used to describe extracting and analyzing knowledge from data by means of techniques and theories drawn from academic disciplines such as mathematics, statistics and information technology. The input for data science may be big data. Big data is the term used for data sets that are so large or complex they can’t be processed by regular database management systems. CQM has 40 years of project experience with data science — 25 years longer than the term has even existed.
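Below you’ll find four projects we’ve delivered, each involving aspects of data science and big data that CQM had to address. The labels in parentheses mark those aspects: the type of analysis (Descriptive, Predictive, Prescriptive) and the big data characteristic (VOLUME, VELOCITY, VARIETY, VERACITY). How did we deal with them? Or rather: how does data science work in practice?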
In a newly built university hospital in Canada, all goods such as food, medicines, linen and waste are brought to their destinations by AGVs (Automated Guided Vehicles). CQM supplied the systems for this and the software to manage it. Routing the AGVs involves a great many movements and decision moments (VOLUME). What made the task even more challenging was that the AGVs also have to transport goods between floors in different hospital wings. CQM was asked to develop an algorithm for the most efficient elevator management in the three hospital wings (Prescriptive). Maintaining flow is important for using the available elevator capacity smartly. In addition, high-priority AGVs (e.g. those carrying hot food) go first. By developing a simulation, CQM provided insight into the duration of the various process steps, and it became apparent which times a smart algorithm could influence (Descriptive). Moreover, the algorithm has to provide an optimal answer within 0.5 seconds (VELOCITY).
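The priority rule mentioned above can be illustrated with a small sketch. This is not CQM's algorithm, which the text does not describe in detail; it only assumes that each waiting AGV has a priority class (lower number = more urgent) and an arrival time at the elevator, and that ties within a class are served first come, first served. The function name and the AGV identifiers are hypothetical.

```python
import heapq


def dispatch_order(requests):
    """Return AGV ids in the order they would be given the elevator.

    requests: iterable of (agv_id, priority, arrival_time) tuples.
    Assumption: a lower priority number means more urgent; within the
    same priority class the AGV that arrived first goes first.
    """
    # Heap entries sort by priority first, then by arrival time.
    heap = [(priority, arrival, agv_id) for agv_id, priority, arrival in requests]
    heapq.heapify(heap)

    order = []
    while heap:
        _, _, agv_id = heapq.heappop(heap)
        order.append(agv_id)
    return order


# Hypothetical example: the hot-food AGV arrives last but is served first.
waiting = [
    ("agv-linen", 2, 0.0),
    ("agv-hot-food", 1, 5.0),
    ("agv-waste", 2, 3.0),
]
print(dispatch_order(waiting))
```

A real elevator controller would also have to weigh travel times, lift capacity and the 0.5-second response limit; the sketch shows only the ordering rule.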
AgroEnergy is the energy specialist for the agricultural sector in the Netherlands. The company helps market gardeners achieve optimal returns from their energy trading. Recently, AgroEnergy introduced BiedOptimaal (OptimalBid), an innovative plug-in that makes the bidding process even easier for market gardeners: at what price do I buy gas and electricity, and in what quantities? BiedOptimaal determines APX bids that minimize the grower’s heating costs.
CQM helped AgroEnergy develop BiedOptimaal. The project followed a data science approach. BiedOptimaal runs daily, just before the APX bids have to be submitted. Many different data sources are used in real time (VARIETY). The most recent data, such as weather forecasts and just-synchronized buffer capacities, are used continuously to arrive quickly at an advised bid. The predictions for heating requirements and energy prices (Predictive) are brought together in an optimization model that determines the optimum bid for the grower (Prescriptive). BiedOptimaal has been available to growers since October 2014, and in 2015 it was also made suitable for growers who use artificial lighting. AgroEnergy now wants to offer a similar service to other sectors, such as the built environment.
In 2008, Corus (now Tata) brought in CQM for a data science assignment: helping to reduce customer complaints about steel quality. The input came from a huge variety of data sources (VARIETY) within Corus. By placing strong emphasis on the goal, it was possible to clearly identify the data that was relevant. To exploit its potential, however, large amounts of data (VOLUME) had to be linked together, which wasn’t easy as the data existed in different formats. In addition, the data contained hard-to-detect contaminations (VERACITY). There were also practical challenges in following steel rolls through the production process. Steel rolls are wound up and unwound several times, so the material sequence is mirrored each time. In addition, the steel is rolled progressively thinner, so its length increases and the location of surface defects shifts and is smeared out. Corus used the data for quick quality monitoring (Descriptive) and began improvement projects for the most common customer complaints (Predictive).
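To make the tracking problem concrete, here is a minimal sketch of how a defect position could be followed through the mirroring and lengthening described above. It is an illustration only, not the method used at Corus: it assumes positions are measured in metres from the head of the roll and that the steel stretches uniformly along its length, and the function name and step labels are hypothetical.

```python
def track_position(position, length, steps):
    """Follow a defect position through a sequence of process steps.

    position: distance of the defect from the head of the roll (metres).
    length:   total length of the roll (metres).
    steps:    list of ("reverse",) or ("elongate", factor) tuples.

    Assumptions: rewinding mirrors the material sequence, so head and
    tail swap; rolling thinner stretches the steel uniformly by `factor`.
    Returns the new (position, length).
    """
    for step in steps:
        kind = step[0]
        if kind == "reverse":
            # Head and tail swap, so the defect is measured from the other end.
            position = length - position
        elif kind == "elongate":
            factor = step[1]
            position *= factor
            length *= factor
        else:
            raise ValueError(f"unknown step: {kind!r}")
    return position, length


# Hypothetical example: a defect 100 m from the head of a 1000 m roll,
# which is rewound once and then rolled to twice its length.
print(track_position(100.0, 1000.0, [("reverse",), ("elongate", 2.0)]))
# prints (1800.0, 2000.0)
```

In practice the stretching is not perfectly uniform and a defect is smeared over a range rather than a point, so a real implementation would track intervals with tolerances instead of single positions.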
Together with ProRail, CQM has developed the Infra-Monitor application. Combined with other data sources, it visualizes the railway infrastructure and the timetable (VARIETY). With this detailed information, users can answer a variety of questions through interactive data analyses (Descriptive). One example is the RisicoRegister Rijwegen (Track Risk Register) analysis. Using this analysis, it’s possible to systematically identify those points in the infrastructure where the risk of collision between two trains is above average. The analysis can also provide insights into the effect of measures such as flank cover switches, ATBVV and overshoot length (Predictive). By then visualizing the results in a schematic layout of the track, the experts get direct insight into the measures taken and the related consequences. ProRail has already used the RisicoRegister Rijwegen in various safety studies.
Want to know more about the various aspects of data science and big data?
See CQM’s vision in this area.