Learning Objective: Explore how systems integrate directly at the data layer, understand patterns like ETL and MDM, and examine how organizations conduct Business-to-Business (B2B) integration using Electronic Data Interchange (EDI).
1. What is Data-Level Integration?
Data-level integration involves sharing, consolidating, or migrating data directly between the databases or data stores of different applications. Instead of asking Application A to send a message to Application B via an API, a background process directly reads the database of Application A and writes it into the database of Application B.
Pros and Cons
Advantages:
Highly efficient for moving massive volumes of data (Batch Processing).
Does not require the target application to have a well-defined API.
Ideal for Business Intelligence (BI) and Data Warehousing.
Disadvantages (Risks):
Bypasses Business Logic: Writing directly to a database skips application-level validations, risking data corruption.
Tight Coupling: If Application A changes its database schema (e.g., renames a column), the integration breaks.
Security risks due to exposing raw database credentials.
2. Common Data Integration Patterns
Organizations typically use the following patterns to integrate at the data layer:
ETL (Extract, Transform, Load): The process of extracting data from various operational systems, transforming it (cleaning, joining, formatting), and loading it into a centralized Data Warehouse for analytics.
Database Replication: Copying data from a master database to one or more slave databases in real-time or near real-time (often used for high availability and disaster recovery).
Master Data Management (MDM): A strategy to create a single, unified "golden record" for critical business entities (like Customers, Products, or Employees) across multiple disparate systems.
Figure 1: Standard ETL (Extract, Transform, Load) Architecture
Example: Data Transformation via SQL (PL/SQL)
In legacy systems, much of the data integration and transformation was handled using advanced database scripts like PL/SQL. Below is an example of an "Upsert" (Update or Insert) operation synchronizing customer records:
MERGE INTO DataWarehouse.Customers dw
USING CRM.Customers crm
ON (dw.CustomerID = crm.CustomerID)
WHEN MATCHED THEN
UPDATE SET dw.Phone = crm.Phone, dw.LastUpdated = SYSDATE
WHEN NOT MATCHED THEN
INSERT (CustomerID, Name, Phone, LastUpdated)
VALUES (crm.CustomerID, crm.Name, crm.Phone, SYSDATE);
3. Business-to-Business (B2B) Integration: Introduction to EDI
While database integration works well inside an organization, you cannot easily share direct database access with external partners (suppliers, distributors, clients) due to security and network constraints. This is where B2B Integration comes in.
Electronic Data Interchange (EDI) is the computer-to-computer exchange of standard business documents in an electronic format between business partners. Developed in the 1970s and 1980s, it replaced paper-based systems like faxing and mailing Purchase Orders and Invoices.
Standards: EDI relies on strict formatting standards like ANSI X12 (North America) and EDIFACT (International/Europe).
Transmission: Originally exchanged over private Value-Added Networks (VANs), today EDI is often transmitted over the internet securely using protocols like AS2 (Applicability Statement 2) or SFTP.
Translation: Because EDI formats are designed for machines and bandwidth efficiency (not human readability), organizations use "EDI Translators" to map EDI data into their internal ERP systems (like SAP or Oracle).
Code Comparison: EDI vs. Modern JSON
To understand why EDI requires specialized translation middleware, look at an example of an ANSI X12 850 (Purchase Order) document compared to a modern JSON equivalent.
ANSI X12 EDI (Purchase Order)
Highly compressed, positional, delimited by specific characters (like `*` and `~`).
Discussion Prompt for Students: If JSON is so much easier to read and process using modern programming languages (like Python or Node.js), why do massive global supply chains and logistics companies still rely heavily on legacy EDI standards today? Discuss the concepts of standard inertia, legacy investment, and industry compliance.