I love to cook on the grill – steaks, hamburgers – really anything. But the first cookout of the summer always goes the same way. It is time to start cooking, but the grill needs a good cleaning and the propane tank needs to be refilled. I cannot get to the “good stuff” (grilling and eating) until I do the prep work.
The same issue exists with most IoT (Internet of Things), data warehouse and machine learning projects. Before the “good stuff” (data analysis) can get started, we must do all the pre-work. The data must be gathered, cleaned, joined and aggregated, often from multiple data sources. Called ETL (Extract, Transform and Load), it is commonly a slow, uninteresting and difficult process. More importantly, time and money are spent doing the data acquisition and cleaning instead of the data science.
Mariner recently began a project for a company that has a manufacturing process, and I was lucky enough to get to work on it. The plan was to use machine learning in Microsoft’s Cortana Intelligence Suite (which I love) to learn something about how the plant might operate more efficiently. We did this by studying the control settings for processes in the plant and seeing how changes in those settings affected operational efficiency. Many temperatures, tank levels, chemical measures and flow rates are captured from plant operations. Our execution plan is shown below. (The image came from an OSIsoft Regional Seminar.)
The company provided us with access to OSIsoft’s PI Integrator for Microsoft Azure, which was released in September 2016. This is the first time I have used one of the PI Integrators. OSIsoft’s PI System is one of the world’s most widely used technologies for the Industrial Internet of Things. I was pleasantly surprised to see how easy it was to get the data. PI Integrator allowed us to pull data from many locations in the plant and publish it to a Microsoft Azure database. It has a user-friendly drag-and-drop interface.
First I created a view and gave it a name. The view name becomes the name of the table in the database used to hold the data. Then I selected the necessary assets and measures. These become the columns of the table. The Next button takes us to a tabular listing of the selected information with a sample of the data. Here we can change the column names and column order, set column filters and much more.
PI Integrator lets us aggregate data for different time periods. It also understands units of measure. In our case, most of the plant data was captured every minute, while our analysis was done at the half-hour or daily level. We simply chose a daily time period and the appropriate aggregation from the list on the left. Nice and quick.
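For readers who want to see what this kind of roll-up amounts to, here is a minimal sketch in Python with pandas. It is not how PI Integrator works internally, just the same idea done by hand. The column name `tank_level`, the helper `aggregate_readings` and the sample values are all made up for illustration.

```python
import pandas as pd


def aggregate_readings(readings: pd.DataFrame, period: str = "D") -> pd.DataFrame:
    """Average every column of a time-indexed frame over fixed periods.

    `period` is a pandas offset alias, e.g. "D" for daily or "30min"
    for half-hourly. The mean is only one possible aggregation.
    """
    return readings.resample(period).mean()


# Hypothetical data: two days of per-minute readings for one measure.
index = pd.date_range("2016-09-01", periods=2 * 24 * 60, freq="min")
readings = pd.DataFrame(
    {"tank_level": [float(i) for i in range(len(index))]},
    index=index,
)

daily = aggregate_readings(readings, "D")            # one row per day
half_hourly = aggregate_readings(readings, "30min")  # one row per half hour

print(daily)
print(len(half_hourly))
```

Swapping `.mean()` for `.min()`, `.max()` or `.sum()` gives the other common aggregations.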
The tool also has the capability to set a Calculation Basis. The Time Weighted options are the ones we use most.
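To show why a time-weighted basis matters, here is a small Python sketch of a time-weighted average next to a plain average. It assumes each reading holds its value until the next reading arrives (a step function); that assumption, the function name `time_weighted_average` and the sample readings are mine, for illustration only, and are not a description of how PI Integrator calculates its results.

```python
from datetime import datetime


def time_weighted_average(samples, end):
    """Average values weighted by how long each one was in effect.

    `samples` is a time-sorted list of (timestamp, value) pairs.
    Each value is assumed to hold until the next timestamp; `end`
    closes the interval for the last sample.
    """
    weighted_sum = 0.0
    total_seconds = 0.0
    next_starts = [t for t, _ in samples[1:]] + [end]
    for (start, value), next_start in zip(samples, next_starts):
        duration = (next_start - start).total_seconds()
        weighted_sum += value * duration
        total_seconds += duration
    if total_seconds <= 0:
        raise ValueError("samples must span a positive amount of time")
    return weighted_sum / total_seconds


# Hypothetical readings: the value sits at 10 for 50 minutes,
# then at 20 for the last 10 minutes of the hour.
samples = [
    (datetime(2016, 9, 1, 0, 0), 10.0),
    (datetime(2016, 9, 1, 0, 50), 20.0),
]
end = datetime(2016, 9, 1, 1, 0)

twa = time_weighted_average(samples, end)
simple = sum(value for _, value in samples) / len(samples)

print(round(twa, 2))  # about 11.67, pulled toward the long-lived value
print(simple)         # 15.0, treats both readings equally
```

When readings arrive at uneven intervals, the plain average overstates the short-lived value, which is why the time-weighted options are usually the better fit for process data.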
Then comes the payoff – the publish. When you publish, a table is created in the Azure environment of your choice and the data is uploaded to the table. As you can see below, this can be a one-time or a scheduled upload.
Now that the data is in an Azure database, we can use all the great tools in the Cortana Intelligence Suite to do the analysis. What a great pair, this OSIsoft and Microsoft collaboration! Get the data quickly and go to work. I would encourage you to take a look at both of these tools. AzureML comes into play as I create several machine learning experiments to learn how changing the settings on the manufacturing controls affects the other measures and the outcomes at the end of the line.
The trends and results are visualized using Microsoft’s Power BI. These are published back into the cloud for sharing.
OSIsoft’s PI Integrator combined with Microsoft’s Cortana Intelligence Suite allowed us to collaborate, get the data and move into the analysis phase with no headaches. We were able to make progress quickly. That is the way it should be!
Now I think I’ll go clean the grill.