Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
Applies to:
Azure SQL Database
Overview
With the Scale out with Azure SQL Database, you can create sharded database solutions. Multi-shard querying is used for tasks such as data collection/reporting that require running a query that stretches across several shards. (Contrast this to data-dependent routing, which performs all work on a single shard.)
- Get a
RangeShardMap
(Java, .NET) orListShardMap
(Java, .NET) using theTryGetRangeShardMap
(Java, .NET), theTryGetListShardMap
(Java, .NET), or theGetShardMap
(Java, .NET) method. See Constructing a ShardMapManager and Get a RangeShardMap or ListShardMap. - Create a
MultiShardConnection
(Java, .NET) object. - Create a
MultiShardStatement
orMultiShardCommand
(Java, .NET). - Set the
CommandText property
(Java, .NET) to a T-SQL command. - Execute the command by calling the
ExecuteQueryAsync
orExecuteReader
(Java, .NET) method. - View the results using the
MultiShardResultSet
orMultiShardDataReader
(Java, .NET) class.
Example
The following code illustrates the usage of multi-shard querying using a given ShardMap named myShardMap
.
using (MultiShardConnection conn = new MultiShardConnection(myShardMap.GetShards(), myShardConnectionString))
{
using (MultiShardCommand cmd = conn.CreateCommand())
{
cmd.CommandText = "SELECT c1, c2, c3 FROM ShardedTable";
cmd.CommandType = CommandType.Text;
cmd.ExecutionOptions = MultiShardExecutionOptions.IncludeShardNameColumn;
cmd.ExecutionPolicy = MultiShardExecutionPolicy.PartialResults;
using (MultiShardDataReader sdr = cmd.ExecuteReader())
{
while (sdr.Read())
{
var c1Field = sdr.GetString(0);
var c2Field = sdr.GetFieldValue<int>(1);
var c3Field = sdr.GetFieldValue<Int64>(2);
}
}
}
}
A key difference is the construction of multi-shard connections. Where SqlConnection
operates on an individual database, the MultiShardConnection
takes a collection of shards as its input. Populate the collection of shards from a shard map. The query is then executed on the collection of shards using UNION ALL
semantics to assemble a single overall result. Optionally, the name of the shard where the row originates from can be added to the output using the ExecutionOptions
property on command.
Note the call to myShardMap.GetShards()
. This method retrieves all shards from the shard map and provides an easy way to run a query across all relevant databases. The collection of shards for a multi-shard query can be refined further by performing a LINQ query over the collection returned from the call to myShardMap.GetShards()
. In combination with the partial results policy, the current capability in multi-shard querying has been designed to work well for tens up to hundreds of shards.
A limitation with multi-shard querying is currently the lack of validation for shards and shardlets that are queried. While data-dependent routing verifies that a given shard is part of the shard map at the time of querying, multi-shard queries do not perform this check. This can lead to multi-shard queries running on databases that have been removed from the shard map.
Multi-shard queries and split-merge operations
Multi-shard queries do not verify whether shardlets on the queried database are participating in ongoing split-merge operations. (See Moving data between scaled-out cloud databases.) This can lead to inconsistencies where rows from the same shardlet show for multiple databases in the same multi-shard query. Be aware of these limitations and consider draining ongoing split-merge operations and changes to the shard map while performing multi-shard queries.
Related content
Not using elastic database tools yet? Check out our Getting Started Guide. For questions, contact us on the Microsoft Q&A question page for SQL Database and for feature requests, add new ideas or vote for existing ideas in the SQL Database feedback forum.