Like the “brain freeze” children get from eating ice cream too fast, some of us get weird phantom headaches every time “data integration” comes to mind.
We’re not referring to the fundamental concept of integration, which connects core business apps and data systems in helpful ways to keep businesses humming along.
Rather, we’re talking about the snowball effect of disparate, uncoordinated, piecemeal integrations that can roll through a single enterprise. Integrations may get out of control, especially for larger organizations that enthusiastically embrace scores of new apps without an integration policy.
The abundance of HR apps, for example, is a direct source of escalating data flows. One website alone lists 783 HR apps for sale. These apps, many cloud-based, have innovative functions employees love to use.
But uncoordinated apps and data flows can lead to duplication and conflict.
Enter the plug-in connector. Connectors are a modular form of middleware. They use the Application Programming Interface (API) of apps to connect with those apps.
Connectors sit between two APIs. They’ll receive information from one app or solution, and process it to make it understandable and accessible to another app or solution. They are like translators or messengers.
You can use connectors to link almost any app or data source: information from business partners, databases, SaaS apps (via their APIs), IoT devices, and many other sources that use standard formats and protocols such as HTTP, FTP, JMS, XML, and JSON.
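To make the "translator" role concrete, here is a minimal sketch of a connector in Python. The field names (`empId`, `fullName`, `salary`), the `hr_to_payroll` function, and the `Connector` class are all hypothetical and do not come from any real product; in-memory lists stand in for the two APIs.

```python
from typing import Any, Callable, Dict, Iterable

Record = Dict[str, Any]


def hr_to_payroll(hr_record: Record) -> Record:
    """Translate a record from an assumed HR-app shape to an assumed payroll shape."""
    first, last = hr_record["fullName"].split(" ", 1)
    return {
        "employee_id": int(hr_record["empId"]),
        "first_name": first,
        "last_name": last,
        "annual_salary_cents": round(float(hr_record["salary"]) * 100),
    }


class Connector:
    """Sits between a source and a target and translates each record in transit."""

    def __init__(
        self,
        fetch: Callable[[], Iterable[Record]],
        translate: Callable[[Record], Record],
        send: Callable[[Record], None],
    ) -> None:
        self.fetch = fetch
        self.translate = translate
        self.send = send

    def run(self) -> int:
        """Move every record from source to target; return how many were sent."""
        count = 0
        for record in self.fetch():
            self.send(self.translate(record))
            count += 1
        return count


# In-memory stand-ins for the two apps' APIs (assumptions for illustration).
source_data = [{"empId": "17", "fullName": "Ada Lovelace", "salary": "85000.50"}]
received: list = []

connector = Connector(lambda: source_data, hr_to_payroll, received.append)
sent = connector.run()
```

Keeping the translation in its own function means it can be reviewed and reused separately from the plumbing. That separation gets lost when many one-off connectors are written without coordination.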
The convenience of connectors means almost anyone can link systems, often without considering cumulative or incompatible effects in the overall business structure.
If you integrate without an underlying plan or guiding policy, it can lead to issues such as:
- duplication of functions
- conflicting software updates
- users negotiating a bewildering menu of apps and interfaces, and
- higher overall IT costs.
Companies can get overwhelmed by the volume and the poor quality of integrations without central oversight.
Signs You May Have Runaway Data Integration
Integration is a beautiful thing. It enables you to access data in many formats from sources that normally could not communicate with each other, and to see that data in one unified view.
That unified view enables better decision-making. It can also bring many other benefits, including reduced costs, increased efficiencies, timely access to information, and resilience and adaptability to changing conditions.
If you have too many uncoordinated integrations, however, that could become a problem.
Here are some signs you may have such a problem.
- Some of your “one-off” integrations are simply incompatible, so some data remains tied up in silos.
- Your many data integrations are churning out lots of data that you are not using.
- Your insights are coming too slowly. You might have slow ETL pipelines. Or perhaps your integrations aren’t keeping up with today’s need for real-time data analytics.
- What started as a simple and successful point-to-point approach has now become too complex to maintain.
- You can’t re-use some of your data for subsequent data integration projects.
- You have some automated data sources that duplicate information in different parts of the business.
How to Control Runaway Integration
- Define your business goals for integration. This may sound obvious, but many firms fail to ask this of themselves before embarking on integration projects.
Also, goals and circumstances change over time; the organization-wide integration approach should change to suit. What worked 10 or even five years ago might not work so well today.
- Do a data audit. Analyze recurring problems to see if integrations are the problem. Also, look at your existing important data sources and systems to see if they are serving you well, or if they could be more rationally connected.
- Analyze and streamline the integration of your business platforms. This might mean adjusting existing solutions or ditching them for a new approach, especially if your current systems are misaligned.
With technologies increasingly converging, effective cross-platform connections have never been more important. For example, cloud talent management platforms and customer relationship management (CRM) systems these days must interact with many other enterprise systems. To understand the impact of talent management practices, people need access to financial, production, and organizational performance data.
- Establish data management and data governance policies. These organization-wide policies will impose structure, minimize potential mistakes, and help make the data you gather more relevant for business decision-making.
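One part of the data audit step in the list above can be sketched in a few lines of Python: list which fields each system holds, then report the fields that appear in more than one system. The system names and field names below are invented for illustration, and a real audit would also compare the values, not just the field names.

```python
from collections import defaultdict
from typing import Dict, List, Set


def find_duplicated_fields(sources: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return each field held by more than one source, with the sources holding it."""
    holders: Dict[str, List[str]] = defaultdict(list)
    for source_name, fields in sources.items():
        for field_name in fields:
            holders[field_name].append(source_name)
    # Keep only fields that show up in two or more systems.
    return {f: sorted(names) for f, names in holders.items() if len(names) > 1}


# Hypothetical inventory of which system stores which fields.
inventory = {
    "hr_app": {"employee_id", "email", "job_title"},
    "payroll_app": {"employee_id", "email", "salary"},
    "crm": {"customer_id", "email"},
}

duplicates = find_duplicated_fields(inventory)
for field_name, systems in sorted(duplicates.items()):
    print(f"{field_name}: held by {', '.join(systems)}")
```

Fields that show up in several systems are candidates for a single agreed source of truth.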
Data governance to the rescue
Data governance, according to TechTarget, means managing the availability, usability, integrity, and security of data in enterprise systems, based on internal data standards and policies.
Data governance can help offset or completely avoid runaway integrations. Governance policies help answer questions like:
- Who owns the data?
- Who can access what data?
- What security measures protect data use and privacy?
- How much of the data complies with new regulations?
- Which data sources are approved to use?
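Some of these questions (ownership, access, and approved sources) can be written down as a machine-checkable registry that integration builders consult before connecting anything. The sketch below is one possible shape, not a standard: the `DataSourcePolicy` fields, the registry entries, and the role names are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class DataSourcePolicy:
    """Governance facts recorded for one data source (hypothetical schema)."""
    name: str
    owner: str
    approved: bool
    allowed_roles: FrozenSet[str]


# Hypothetical registry; a real one would live in a shared catalog.
REGISTRY: Dict[str, DataSourcePolicy] = {
    "payroll_app": DataSourcePolicy(
        "payroll_app", "finance", True, frozenset({"finance_analyst"})
    ),
    "survey_tool": DataSourcePolicy(
        "survey_tool", "unassigned", False, frozenset()
    ),
}


def check_access(source: str, role: str) -> str:
    """Answer 'may this role integrate with this source?' with a reason."""
    policy = REGISTRY.get(source)
    if policy is None:
        return "denied: source is not registered"
    if not policy.approved:
        return "denied: source is not approved"
    if role not in policy.allowed_roles:
        return f"denied: role '{role}' may not access {source}"
    return f"allowed: owner is {policy.owner}"


print(check_access("payroll_app", "finance_analyst"))
print(check_access("survey_tool", "finance_analyst"))
```

Returning a reason with each decision makes it easier for business users to see which policy blocked a request and who to talk to about it.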
Meanwhile, data management, according to the Gartner Glossary, is the creation and implementation of practices, architectures, policies, and tools to consistently access and deliver data to smoothly run all apps and processes in a business.
To put it another way: data governance creates guiding data policies and procedures, while data management enacts those policies to compile, secure, and use that data for decision-making. Both need to work together to deliver the best results.
When you marry good data governance with the best practices in data management, your runaway integrations should soon become a distant memory. You’ll be able to better collaborate with all your business users to select and implement data integration solutions more strategically and holistically.
Trends in data integration
Some general trends in data and data integration include:
- Rise in volume. Data is multiplying, and with it comes the challenge of keeping data quality high as volumes grow. This means today’s data systems must be efficient and scalable.
- No-code interfaces. To cope with the increased volume of data processing needs, some software firms are making it easier for some users to do parallel processing and data partitioning, with the use of a “no-code” interface. This is helpful for enabling on-demand data for decision-makers.
- Spatial data. Another trend is the increased importance of spatial data in apps such as Google Maps or systems such as global positioning systems. TechTarget defines spatial data as any data that references a specific geographical location. Users can save spatial data in many formats, and the data can contain more than just geographic information—for instance, one may analyze it to understand how variables affect different individuals or communities.
- Augmented Reality. Enabled by spatial data, Augmented Reality is appearing in handsets and in underlying technology from Google and Apple, for example to show 3D representations of buildings in maps.
- Real-time data processing. A short “time to decision” gives firms an advantage over their competitors. It’s becoming more desirable for firms to process variable loads and data types in real time or close to it. More firms use real-time data streaming to enable quick decision-making.
- Hybrid integration. This trend, which began in 2021, continues as more organizations seek flexibility in using their own preferred mix of on-premises, cloud, and edge data sources and solutions.
Approaches to data integration today
Meanwhile, the major approaches to data integration today range from traditional Extract-Transform-Load (ETL), which often happens in a centralized data center or data warehouse, to newer, more distributed approaches such as Data Virtualization. Here are some more details.
- Data Consolidation: Extract-Transform-Load (ETL).
Data is pulled from main data sources (often network servers), reformatted, and standardized (transformed), before being loaded into a data store for analysis. Traditionally this was done in batches, e.g. once every few hours.
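As a rough illustration of the three ETL stages, the following Python sketch extracts rows from CSV text, standardizes them, and loads them into an in-memory SQLite table. The sample data, column names, and table name are made up, and a production pipeline would add scheduling, validation, and error handling.

```python
import csv
import io
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical raw export from a source system, with inconsistent formatting.
RAW_CSV = """emp_id,name,hire_date
1, Ada Lovelace ,2021-03-01
2,alan turing,2020-11-15
"""


def extract(text: str) -> List[Dict[str, str]]:
    """Extract: read raw rows from the source export."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: List[Dict[str, str]]) -> List[Tuple[int, str, str]]:
    """Transform: cast ids to integers and standardize name spacing and casing."""
    return [
        (int(r["emp_id"]), r["name"].strip().title(), r["hire_date"])
        for r in rows
    ]


def load(conn: sqlite3.Connection, rows: List[Tuple[int, str, str]]) -> None:
    """Load: write the cleaned batch into the analysis store."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS employees "
        "(emp_id INTEGER PRIMARY KEY, name TEXT, hire_date TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO employees VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT emp_id, name, hire_date FROM employees ORDER BY emp_id"
).fetchall()
print(loaded)
```

Running the three functions on a schedule, once per batch of new exports, corresponds to the traditional batch style described above.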
- Data Propagation.
This duplicates data from one or more sources to another. Data from source data warehouses is copied and transferred to local access databases. Data copies can be sent at different times or at the same time.
Enterprise Data Replication (EDR) copies data from one system to another and is used to send large amounts of data between remote and central sites.
- Data Federation.
This uses middleware as a bridge to present disparate data sources as a unified, virtual database, with a single interface for accessing distributed data that follows different data models. Users query it with Structured Query Language (SQL), and those queries can return results in real time.
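The idea of one SQL interface over several separate stores can be imitated on a small scale with SQLite's `ATTACH DATABASE`, which lets a single connection query more than one database. This is only a stand-in for real federation middleware, which would reach remote systems with different data models. The `hr` and `finance` databases and their tables are invented for the example.

```python
import sqlite3

# One connection acts as the single query interface.
conn = sqlite3.connect(":memory:")

# Attach two separate in-memory databases that play the role of two systems.
conn.execute("ATTACH DATABASE ':memory:' AS hr")
conn.execute("ATTACH DATABASE ':memory:' AS finance")

conn.execute("CREATE TABLE hr.employees (emp_id INTEGER, name TEXT)")
conn.execute("CREATE TABLE finance.salaries (emp_id INTEGER, salary INTEGER)")
conn.executemany(
    "INSERT INTO hr.employees VALUES (?, ?)", [(1, "Ada"), (2, "Alan")]
)
conn.executemany(
    "INSERT INTO finance.salaries VALUES (?, ?)", [(1, 90000), (2, 85000)]
)

# A single SQL statement joins data that lives in two different databases.
rows = conn.execute(
    """
    SELECT e.name, s.salary
    FROM hr.employees AS e
    JOIN finance.salaries AS s ON s.emp_id = e.emp_id
    ORDER BY e.emp_id
    """
).fetchall()
print(rows)
```

The data stays in its own database; only the query spans both.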
- Data Virtualization.
Data Virtualization is a hot new trend, and one source predicts that through 2022, 60% of all organizations will implement Data Virtualization as a key delivery style in their data integration architecture.
Data Virtualization is an evolution of Data Federation. In its advanced form, it doesn’t need to apply a data model. It retrieves and manipulates data from many disparate sources without requiring users to know how the data is formatted or even where it is physically located. Data is not copied or moved from the source but integrated virtually. This approach enables real-time information, self-service data, centralized metadata, security, and governance.
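The "not copied or moved" property can be shown with a toy virtual view in Python that stores no data and reads each source only when queried. The `VirtualView` class and the `crm` and `erp` sources are hypothetical; commercial virtualization platforms add query optimization, caching, metadata, and security on top of this basic idea.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator


class VirtualView:
    """Resolves records from registered sources at query time; holds no data."""

    def __init__(self) -> None:
        self._readers: Dict[str, Callable[[], Iterable[Dict[str, Any]]]] = {}

    def register(
        self, name: str, reader: Callable[[], Iterable[Dict[str, Any]]]
    ) -> None:
        """Record how to read a source, without reading it yet."""
        self._readers[name] = reader

    def query(self, **filters: Any) -> Iterator[Dict[str, Any]]:
        """Yield matching records from every source, tagged with their origin."""
        for name, reader in self._readers.items():
            for record in reader():
                if all(record.get(k) == v for k, v in filters.items()):
                    yield {**record, "_source": name}


# Two sources in different formats: a Python list and a JSON document.
crm_rows = [{"customer": "Acme", "region": "EU"}]
erp_json = (
    '[{"customer": "Globex", "region": "EU"},'
    ' {"customer": "Initech", "region": "US"}]'
)

view = VirtualView()
view.register("crm", lambda: crm_rows)
view.register("erp", lambda: json.loads(erp_json))

eu_before = [r["customer"] for r in view.query(region="EU")]

# A change at the source is visible on the next query; nothing was copied.
crm_rows.append({"customer": "Umbrella", "region": "EU"})
eu_after = [r["customer"] for r in view.query(region="EU")]
print(eu_before, eu_after)
```

Because the view keeps only reader functions, the second query picks up the new CRM record without any reload step.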
Pixentia is a full-service technology company dedicated to helping clients solve business problems, improve the capability of their people, and achieve better results.