By: Evan Sterling, Analyst Consultant, Praescient Analytics
One of the most important aspects of any big data project is data modeling. Data modeling creates the structure your data will live in. It defines how things are labeled and organized, which determines how your data can and will be used and, ultimately, what story that information will tell. If the software tool you're using for your data is the brain, data modeling defines how the neurons connect with each other. Data modeling choices need to be made early in any software deployment and have a wide-reaching impact on the overall success of the project.
Many software platforms, such as Palantir and Semantica, have an ontology component used to set up classification and taxonomy. It's essential in the early stages to get this as right as possible, as it determines how every piece of information pulled into your system is treated and how it can be used. Both example systems, Palantir and Semantica, have Dynamic Ontologies. Dynamic Ontologies have flexible rules that allow you to create the structure of your data from the ground up. After deployment, these Dynamic Ontologies can be changed to incorporate new knowledge as necessary. However, changing the ontology too radically can make it difficult to achieve consistent search results from the database. Solidly built and well-maintained ontologies allow these software platforms to hold viable institutional knowledge across many sets of users, no matter who joins or leaves the organization, acting as an information guiderail that matches the capabilities of the tool with the analytic needs of the user.
Ontology is essential in two dimensions: it must meet the specific needs of your domain, and it must adapt over time. A counter-terror (CT) analyst needs a vastly different ontology from a supply-chain analyst or an agricultural analyst. Each researcher needs a classification system with groupings that matter to them. For example, the CT analyst may want to study transactions and leadership structures, while the agricultural analyst may need to link techniques and locations with effective production. So, while both may draw from the same datasets, the agricultural deployment may find it vital to split farms into crop sections, whereas the CT analyst is only concerned with the owner and whether the crops are legal or controlled. The ontology must also survive the test of time. For CT deployments, the emergence of a new terrorist group, or the splitting of an existing one, must not break your ontology. That's why the input of data experts from the very beginning is so important. Someone with the proper knowledge and experience can help the organization build an ontology that is uniquely suited to their project, yet robust enough to weather change as the project matures and develops.
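To make the idea concrete, the two requirements above — domain-specific groupings and the ability to evolve without breaking existing records — can be sketched as a small registry of object types. This is a minimal illustration only; the class, type names, and properties here are hypothetical and do not reflect Palantir's or Semantica's actual APIs:

```python
class Ontology:
    """A minimal sketch of a dynamic ontology: a registry of object
    types and their properties that can be extended after deployment."""

    def __init__(self):
        self.types = {}  # type name -> set of allowed property names

    def define_type(self, name, properties):
        # Register a new object type, e.g. when a new group emerges.
        self.types[name] = set(properties)

    def add_property(self, type_name, prop):
        # Evolve an existing type without invalidating old records.
        self.types[type_name].add(prop)

    def validate(self, type_name, record):
        # Accept a record only if all its fields exist in the ontology.
        return type_name in self.types and set(record) <= self.types[type_name]


# A counter-terror deployment models people and transactions...
ct = Ontology()
ct.define_type("Person", ["name", "role", "organization"])
ct.define_type("Transaction", ["sender", "receiver", "amount"])

# ...while an agricultural deployment splits farms into crop sections.
ag = Ontology()
ag.define_type("Farm", ["owner", "location"])
ag.define_type("CropSection", ["farm", "crop", "yield"])

# When a new group splits off, extend the ontology rather than rebuild
# it: records created under the old schema remain valid.
ct.add_property("Person", "faction")

print(ct.validate("Person", {"name": "X", "role": "courier"}))  # True
print(ct.validate("Person", {"name": "X", "alias": "Y"}))       # False: 'alias' is undefined
```

Note how both deployments draw on the same mechanism but define entirely different groupings, and how extending a type is additive, so existing data keeps validating — the "survive the test of time" property described above.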
Praescient Analytics helps analysts save time and get more out of their data by focusing their ontology on the right distinctions and the right level of detail. Because our personnel work across so many fields (financial, legal, commercial, defense, intelligence, etc.), our analysts can help you custom-build your ontology no matter how niche your needs and no matter what questions you bring to the table. With any big data project, Praescient can help you get the most out of your information.