ADO.NET and Entity Framework (EF)
In the landscape of .NET development, bridging the gap between object-oriented application logic and relational database storage is a fundamental challenge. There are various data access strategies, ranging from the raw metal of ADO.NET to the sophisticated abstraction of Object-Relational Mappers (ORMs).
Based on recent technical notes, I want to break down the core architecture of these systems. We will explore how to manage connections securely, the trade-offs between connected and disconnected data models, and how to leverage Entity Framework (EF) Code First workflows effectively without sacrificing performance.
I. The Metal: ADO.NET Fundamentals
At its core, ADO.NET provides the foundational libraries to communicate with data sources. It standardizes how we connect, execute commands, and retrieve results, regardless of whether we are talking to SQL Server, Oracle, or MySQL.
The Architecture: Providers and Connections
The architecture relies on specific Data Providers (like System.Data.SqlClient or System.Data.OleDb) that implement standardized interfaces such as IDbConnection and IDbCommand.
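To illustrate the provider model, here is a minimal sketch of code written against the abstract interfaces rather than a concrete provider; the connString variable and the Users table are assumed for illustration only.
// Sketch: programming against the provider-neutral interfaces.
IDbConnection conn = new SqlConnection(connString); // could be an OleDbConnection instead
IDbCommand cmd = conn.CreateCommand();
cmd.CommandText = "SELECT COUNT(*) FROM Users";
conn.Open();
object total = cmd.ExecuteScalar(); // the same calling code works for any provider
conn.Close();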
A critical concept here is Connection Pooling. Establishing a physical connection to a database server is expensive. ADO.NET providers optimize this by keeping a pool of open connections active. When you call Open() on a connection object, you are often grabbing an existing connection from the pool; when you call Close(), you return it to the pool rather than severing the link to the server.
Best Practice: Always wrap your connections in using blocks or try-finally structures to ensure the connection is disposed (returned to the pool) immediately, preventing leaks.
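As a minimal sketch (server and database names below are placeholders), a using block returns the connection to the pool as soon as the scope ends, and pool behavior can be tuned through connection string keywords:
// Pooling is on by default for SqlClient; Max Pool Size is optional tuning.
var connString = "Server=.;Database=Forum;Integrated Security=true;Max Pool Size=100";
using (var conn = new SqlConnection(connString))
{
    conn.Open();   // typically grabs an existing physical connection from the pool
    // ... execute commands ...
}                  // Dispose/Close returns the connection to the pool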
The Security Vector: SQL Injection
One of the most persistent vulnerabilities in data-driven systems is SQL Injection, often caused by string concatenation in commands.
Consider a scenario where we want to update a user’s bio in a forum application.
❌ VULNERABLE APPROACH (String Concatenation)
// If userInput is: '; DROP TABLE Users; -- -> the database is destroyed.
string query = "UPDATE Users SET Bio = '" + userInput + "' WHERE Id = 1";
var cmd = new SqlCommand(query, conn);
cmd.ExecuteNonQuery();
✅ SECURE APPROACH (Parameterization)
The IDbCommand interface supports a Parameters collection, which treats input strictly as literal values, never as executable code.
string query = "UPDATE Users SET Bio = @Bio WHERE Id = @Id";
using (var cmd = new SqlCommand(query, conn))
{
// The database engine treats @Bio strictly as data
cmd.Parameters.Add(new SqlParameter("@Bio", SqlDbType.NVarChar) { Value = userInput });
cmd.Parameters.Add(new SqlParameter("@Id", SqlDbType.Int) { Value = 1 });
conn.Open();
cmd.ExecuteNonQuery(); // Returns int (rows affected)
}
II. Data Access Patterns: Connected vs. Disconnected
When retrieving data, we generally choose between two models: the DataReader (Connected) and the DataSet (Disconnected).
1. The Connected Model (DataReader)
The IDataReader (e.g., SqlDataReader) is a forward-only, read-only stream. It is highly performant and memory-efficient because it holds one row in memory at a time. However, it requires the connection to remain open for the duration of the read.
Use Case: High-performance APIs or processing massive reports where you don’t need to update the data immediately.
using (var conn = new SqlConnection(connString))
{
    var cmd = new SqlCommand("SELECT Username, LastLogin FROM Users", conn);
    conn.Open();
    // ExecuteReader returns a cursor over the result set
    using (var reader = cmd.ExecuteReader())
    {
        while (reader.Read()) // Advances to the next record
        {
            // Access columns safely, potentially checking for NULLs
            string user = reader["Username"].ToString();
            // Process data...
        }
    }
}
2. The Disconnected Model (DataSet)
The DataSet acts as an in-memory database cache. It is populated by a DataAdapter, which manages the connection automatically (opening it, fetching data, and closing it immediately).
Use Case: Desktop applications (WPF/WinForms) where a user needs to edit a grid of data offline and sync changes later.
var dataSet = new DataSet();
var adapter = new SqlDataAdapter("SELECT * FROM Users", connString);
// Fill: Opens connection -> Fetches Data -> Closes Connection
adapter.Fill(dataSet, "Users");
// We can now traverse 'dataSet' without an active DB connection
foreach (DataRow row in dataSet.Tables["Users"].Rows)
{
    Console.WriteLine(row["Username"]);
}
// Changes can be pushed back using adapter.Update()
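As a rough sketch of that round trip (assuming the SELECT returns a primary key so commands can be generated), a SqlCommandBuilder can derive the INSERT, UPDATE, and DELETE commands from the adapter's SELECT statement:
// The builder generates UPDATE/INSERT/DELETE commands from the SELECT statement
var builder = new SqlCommandBuilder(adapter);

// Edit the cached data offline...
dataSet.Tables["Users"].Rows[0]["Username"] = "renamed_user";

// ...then push only the changed rows back to the database
adapter.Update(dataSet, "Users");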
III. The Object-Relational Mapper: Entity Framework
While ADO.NET is powerful, it lacks strong typing. Queries return loosely typed results (DataRows and object values), and the compiler cannot verify that our SQL column names match our class properties. Entity Framework (EF) solves this “impedance mismatch” by mapping C# classes to database tables.
Code First Workflow
In modern .NET development (especially with EF Core), the Code First approach is dominant. We define our domain model in C# first, and the framework generates the database schema.
Example Domain Model: Let’s imagine a Logistics system with Warehouses and Shipments.
// POCO (Plain Old CLR Object) class
public class Warehouse
{
    public int Id { get; set; } // By convention, this becomes the Primary Key

    [Required] // Data Annotations define schema constraints
    [StringLength(100)]
    public string Location { get; set; }

    // Navigation Property: Defines the relationship
    public ICollection<Shipment> Shipments { get; set; }
}

public class Shipment
{
    public int Id { get; set; }
    public string TrackingCode { get; set; }
    public int WarehouseId { get; set; } // Foreign Key
    public Warehouse Warehouse { get; set; }
}
The DbContext
The DbContext is the “Unit of Work.” It tracks changes to entities and manages the transaction when writing to the database.
public class LogisticsContext : DbContext
{
    public DbSet<Warehouse> Warehouses { get; set; }
    public DbSet<Shipment> Shipments { get; set; }

    protected override void OnConfiguring(DbContextOptionsBuilder options)
        => options.UseSqlServer("...");
}
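To show the Unit of Work in action (the location and tracking code below are illustrative values), adding a graph of related entities still requires only one SaveChanges call, which persists everything in a single transaction:
using (var db = new LogisticsContext())
{
    var warehouse = new Warehouse
    {
        Location = "Rotterdam",
        Shipments = new List<Shipment>
        {
            new Shipment { TrackingCode = "TRK-0001" }
        }
    };

    db.Warehouses.Add(warehouse);

    // Both the Warehouse and its Shipment are inserted in one transaction,
    // and the generated keys are written back to the objects.
    db.SaveChanges();
}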
LINQ: Strongly Typed Queries
Instead of writing raw SQL strings, we use LINQ (Language Integrated Query). This allows the compiler to check types and syntax at build time.
using (var db = new LogisticsContext())
{
    // EF translates this C# expression into T-SQL
    var newYorkWarehouses = db.Warehouses
        .Where(w => w.Location == "New York")
        .ToList();
}
IV. Performance Pitfalls: Loading Strategies
One of the most common performance issues I see in code reviews is the N+1 Query Problem, caused by improper loading of related data.
Lazy Loading vs. Eager Loading
Lazy Loading (The Trap): If you access a navigation property (like warehouse.Shipments) without loading it first, EF may dynamically query the database at that exact moment.
// 1 Query to get Warehouses
var warehouses = db.Warehouses.ToList();
foreach (var w in warehouses)
{
    // DANGER: This executes a NEW SQL query for EACH iteration!
    Console.WriteLine($"Shipments: {w.Shipments.Count}");
}
If you have 100 warehouses, you just executed 101 database queries.
Eager Loading (The Solution): Use the .Include() method to instruct EF to fetch related data in the initial query using a SQL JOIN.
// 1 Query total: Fetches Warehouses JOIN Shipments
var warehouses = db.Warehouses
    .Include(w => w.Shipments)
    .ToList();
Tracking vs. No-Tracking
When you are strictly reading data (e.g., for an API response) and do not intend to modify it, use .AsNoTracking(). This disables the overhead of the change tracker.
var readOnlyData = db.Shipments.AsNoTracking().ToList();
The overhead of the change tracker in Entity Framework primarily stems from the memory and processing required to monitor the state of entities during the lifecycle of a DbContext.
- Snapshot Storage: When an entity is queried and tracked, the DbContext must store the original state of that entity in memory. This roughly doubles the memory footprint for that data, because the context holds both the current instance and the original values needed to detect changes later.
- Comparison Logic: When SaveChanges() is called, the context compares the current state of each entity against the stored original state to determine which properties have been modified. This comparison consumes CPU cycles.
- Registration Management: The DbContext has to actively register and maintain lists of all new, deleted, and retrieved entities to ensure they are properly synchronized with the database.
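A short sketch of what that means in practice (the First() query and the tracking-code value are illustrative):
using (var db = new LogisticsContext())
{
    var shipment = db.Shipments.First();   // tracked: a snapshot of original values is stored
    shipment.TrackingCode = "TRK-UPDATED"; // modify the in-memory entity

    // SaveChanges compares current values to the snapshot and
    // generates an UPDATE only for the columns that changed.
    db.SaveChanges();

    // Read-only query: no snapshots, no comparison, less memory.
    var report = db.Shipments.AsNoTracking().ToList();
}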
V. Advanced Capabilities
Direct SQL Execution
Sometimes, LINQ generates suboptimal SQL, or you need to call a Stored Procedure. EF allows you to drop down to raw SQL while still returning strongly typed objects.
var id = new SqlParameter("Id", 5);

// FromSqlRaw stays injection-safe only when values are passed as parameters, as below.
// If you prefer string interpolation, use FromSqlInterpolated, which parameterizes the
// interpolated values for you.
var result = db.Warehouses
    .FromSqlRaw("SELECT * FROM Warehouses WHERE Id = @Id", id)
    .ToList();
Migrations
Managing database schema changes is done via Migrations. Instead of manually altering tables, you create a migration code file that describes the change (e.g., “AddColumn”). This version-controls your database schema.
- Modify C# Model: Add a property public int Capacity { get; set; } to Warehouse.
- Add Migration: dotnet ef migrations add AddCapacity.
- Update Database: dotnet ef database update.
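For reference, the generated migration for this change looks roughly like the following; the exact content depends on the EF Core version and provider.
public partial class AddCapacity : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        // Adds the new column with a default so existing rows remain valid
        migrationBuilder.AddColumn<int>(
            name: "Capacity",
            table: "Warehouses",
            nullable: false,
            defaultValue: 0);
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        // Reverses the change when rolling back
        migrationBuilder.DropColumn(
            name: "Capacity",
            table: "Warehouses");
    }
}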
Caution: Always inspect the generated migration files. EF might interpret a “Rename” as a “Drop and Create,” which would result in data loss unless manually corrected.
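For example, if a renamed property is scaffolded as a drop-and-add, replacing those calls with an explicit rename preserves the existing data (the column names below are hypothetical):
// Instead of DropColumn + AddColumn, keep the data with a rename:
migrationBuilder.RenameColumn(
    name: "Location",
    table: "Warehouses",
    newName: "Address");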
Attribution: This post is based on an interpretation of the educational material "ADONET Entity Framework" (BME).
