Loading Bulk data to SQL using OpenXML


Loading Bulk data to SQL using OpenXML. Passing XML Document as parameter. XML annotation and hierarchy support. Explanation with examples.

Bulk data transactions using OpenXML



OPENXML is a new function added to SQL Server 2000 that provides a rowset view over an XML document. Since a rowset is simply a set of rows that contain columns of data, OPENXML is the function that allows an XML document to be treated in the familiar relational database format. It allows for the passing of an XML document to a T-SQL stored procedure for updating the data.

OpenXML – to summarize



-- It extends the SQL language
-- It is used within T-SQL stored procedures
-- The XML document is passed as a parameter
-- It uses row and column selectors based on XPath
-- It supports the following:
   -- Attribute-centric and element-centric mappings
   -- Edge table rowset
   -- XML annotation/overflow column
   -- Hierarchy support

OpenXML – Syntax

OPENXML(idoc, rowpattern [, flags])
[WITH (SchemaDecl | Tablename)]

Parameters

-- idoc
Is the document handle of the internal representation of an XML document. The internal representation of an XML document is created by calling sp_xml_preparedocument.
sp_xml_preparedocument reads the Extensible Markup Language (XML) text provided as input, parses the text using the MSXML parser (Msxml2.dll), and provides the parsed document in a state ready for consumption. This parsed document is a tree representation of the various nodes (elements, attributes, text, comments, and so on) in the XML document.
sp_xml_preparedocument returns a handle that can be used to access the newly created internal representation of the XML document.

-- rowpattern
Is the XPath pattern used to identify the nodes (in the XML document whose handle is passed in the idoc parameter) to be processed as rows.

-- flags
Indicates the mapping that should be used between the XML data and the relational rowset, and how the spill-over column should be filled. flags is an optional input parameter, and can be one of these values: 0 (the default, attribute-centric mapping), 1 (attribute-centric mapping), 2 (element-centric mapping), or 8 (combined with 1 or 2 so that consumed data is not copied to the overflow column).

-- SchemaDecl
   In-line metadata for the relational view: (column_name1 column_type1 [colpattern1], ..., column_namej column_typej [colpatternj])
-- Tablename
   An existing table from which to obtain the metadata for the relational view
-- Edge table
   The rowset format returned if neither SchemaDecl nor Tablename is specified
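
To make the parameters concrete, here is a minimal sketch that prepares a small document, reads it through OPENXML with an in-line SchemaDecl, and releases the handle. The XML content and the column name FirstName are assumptions made up for illustration; only the three system calls come from the description above.

```sql
-- Hypothetical sample document; element and attribute names are assumptions.
DECLARE @hDoc int;
DECLARE @xml nvarchar(4000);
SET @xml = N'<root>
  <Employee eid="1" fname="Ann" lname="Lee" />
  <Employee eid="2" fname="Raj" lname="Rao" />
</root>';

-- idoc: parse the text and get a handle to the internal representation.
EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml;

-- rowpattern: one row per <Employee>. flags = 1: attribute-centric mapping.
-- The WITH clause is an in-line SchemaDecl; the colpattern '@fname' maps the
-- FirstName column to the fname attribute, the other columns map by name.
SELECT eid, FirstName, lname
FROM OPENXML(@hDoc, '/root/Employee', 1)
     WITH (eid       int,
           FirstName varchar(20) '@fname',
           lname     varchar(20));

-- Release the memory held by the parsed document.
EXEC sp_xml_removedocument @hDoc;
```

A colpattern is only needed when the column name differs from the attribute or element name, as with FirstName here.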


The following part of the article describes the usage of OPENXML function to insert multiple rows of data in a single database call. This can be an effective alternative to looping through an array and calling a stored procedure to insert a row each time.

The example provided inserts 10 rows into a table, so the OPENXML approach cuts the database calls from 10 to 1 in this case. Fewer database calls can translate into significant performance and scalability benefits. Each time a database call is made, network and database resources are used. The more demands you make on these resources, the more likely you are to see your application's performance degrade. OPENXML enables you, in essence, to package data together in a single call (as XML), map it to a rowset view, and execute all of the inserts within the same database call, which minimizes the use of these resources.

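CREATE PROC sp_insert_employee @empdata ntext
AS
DECLARE @hDoc int
-- Parse the XML text and get a handle to its internal representation
EXEC sp_xml_preparedocument @hDoc OUTPUT, @empdata
-- One row per <Employee> node; column names and types come from the Employee table.
-- The row pattern assumes the <Employee> nodes sit under a single <root> element,
-- as in the update example later in this article.
INSERT INTO Employee
SELECT *
FROM OPENXML(@hDoc, '/root/Employee')
WITH Employee
-- Free the memory held by the parsed document
EXEC sp_xml_removedocument @hDoc
GO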

You can see that the only parameter passed to the procedure is the XML, passed as ntext. Depending on the size of the XML string you are working with, the XML string input parameter can be (n)char or (n)varchar instead of (n)text. The @hDoc variable is required by sp_xml_preparedocument as an output parameter. sp_xml_preparedocument is a SQL Server system stored procedure that creates an internal representation of the XML document passed to it, and returns the document handle in @hDoc.

The OPENXML function accepts three arguments, the first two of which are required. The first argument is the document handle that you created by calling sp_xml_preparedocument. This tells OPENXML which XML document you are working with. The second argument is an XPath (XML Path Language) pattern used to identify the nodes in the XML document.

Each node identified by the XPath pattern corresponds to a single row in the rowset generated by OPENXML. In our example, there are 10 <Employee> nodes, each representing a row in the rowset. The third argument is optional and specifies how the mapping should occur between the rowset created by OPENXML and the XML document. The default is attribute-centric, which means that an XML attribute of a given name is stored in the rowset column with the same name.
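
The effect of the third argument is easiest to see by running the same query with different flag values. The following is a small sketch; the document, which deliberately mixes an attribute and a child element, is a made-up sample.

```sql
-- Hypothetical document: eid is an attribute, fname is a child element.
DECLARE @hDoc int;
DECLARE @xml nvarchar(4000);
SET @xml = N'<root>
  <Employee eid="1"><fname>Ann</fname></Employee>
</root>';

EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml;

-- flags = 1 (attribute-centric): eid is read from the attribute,
-- fname finds no matching attribute and comes back NULL.
SELECT * FROM OPENXML(@hDoc, '/root/Employee', 1)
         WITH (eid int, fname varchar(20));

-- flags = 2 (element-centric): fname is read from the child element,
-- eid finds no matching element and comes back NULL.
SELECT * FROM OPENXML(@hDoc, '/root/Employee', 2)
         WITH (eid int, fname varchar(20));

-- flags = 3 (1 + 2): attributes are tried first, then child elements,
-- so both columns are filled.
SELECT * FROM OPENXML(@hDoc, '/root/Employee', 3)
         WITH (eid int, fname varchar(20));

EXEC sp_xml_removedocument @hDoc;
```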

The WITH clause allows you to specify a Schema declaration (to specify additional mapping between a column in the rowset and a value in the XML document) or the table name if the table already exists with the desired schema.
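
When the WITH clause is left out altogether, OPENXML returns the document in edge table format, one row per node. A brief sketch with a made-up document shows what that looks like; only a few of the edge table columns are selected here.

```sql
-- Hypothetical sample document.
DECLARE @hDoc int;
DECLARE @xml nvarchar(4000);
SET @xml = N'<root>
  <Employee eid="1" fname="Ann" />
</root>';

EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml;

-- No WITH clause: the result is the edge table. Each element, attribute,
-- and text node under the matched <Employee> becomes a row, linked to its
-- parent through id/parentid.
SELECT id, parentid, nodetype, localname, [text]
FROM OPENXML(@hDoc, '/root/Employee');

EXEC sp_xml_removedocument @hDoc;
```

The edge table is mostly useful for exploring an unfamiliar document before writing a SchemaDecl for it.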

The example does a simple insert into the Employee table, and since the XML document was created specifically to insert multiple rows into the Employee table, it is sufficient to specify the table name Employee in our WITH clause.
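
A call to the procedure might look like the sketch below. The table definition is an assumption (the column names eid, fname, and lname are borrowed from the update example later in the article, the types are guesses), and the call assumes the procedure selects its rows with the pattern '/root/Employee'. In practice the table would exist before the procedure is created.

```sql
-- Assumed table layout; column types are illustrative guesses.
CREATE TABLE Employee (
    eid   int PRIMARY KEY,
    fname varchar(20),
    lname varchar(20)
);
GO

-- One database call inserts every <Employee> node in the document.
-- Attribute names match the table's column names, so the default
-- attribute-centric mapping with "WITH Employee" is sufficient.
EXEC sp_insert_employee N'<root>
  <Employee eid="1" fname="Ann" lname="Lee" />
  <Employee eid="2" fname="Raj" lname="Rao" />
  <Employee eid="3" fname="Mia" lname="Kim" />
</root>';
GO

-- Check the result.
SELECT eid, fname, lname FROM Employee ORDER BY eid;
```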

The last statement, EXEC sp_xml_removedocument @hDoc, is called to remove the XML document from its storage location in the internal cache of SQL Server.

In summary, the new OPENXML function in SQL Server 2000 can be useful for processing multiple table inserts within a single database call. Mapping a specified portion of an XML document to a rowset inside a stored procedure makes repetitive inserts much more efficient.

You can also update and delete rows with XML using OPENXML. Without going into specifics, the process is basically:
1. Create an internal representation of the XML document with sp_xml_preparedocument
2. Perform the UPDATE / DELETE using the FROM OPENXML () WITH ... syntax, referencing the internal representation of the XML document
3. Destroy the internal representation of the XML document with sp_xml_removedocument

Examples of how to use OPENXML for updating and deleting records are given below:

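CREATE PROC sp_update_employee @empdata ntext
AS
DECLARE @hDoc int
-- Parse the XML text and get a handle to its internal representation
EXEC sp_xml_preparedocument @hDoc OUTPUT, @empdata
-- Update each Employee row whose eid matches a row in the XML rowset
UPDATE Employee
SET
Employee.fname = XMLEmployee.fname,
Employee.lname = XMLEmployee.lname
FROM OPENXML(@hDoc, '/root/Employee')
WITH Employee XMLEmployee
WHERE Employee.eid = XMLEmployee.eid
-- Free the memory held by the parsed document
EXEC sp_xml_removedocument @hDoc
-- Return the updated table as XML
SELECT *
FROM Employee
FOR XML AUTO
GO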

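CREATE PROC sp_delete_data @empdata ntext
AS
DECLARE @hDoc int
-- Parse the XML text passed in @empdata
EXEC sp_xml_preparedocument @hDoc OUTPUT, @empdata
-- Delete every customer whose CustomerID appears in the XML rowset.
-- The column patterns read attributes of <OrderDetail> itself; for values
-- stored on the parent <Order> element, use a pattern such as '../@CustomerID'.
DELETE Customers
FROM OPENXML(@hDoc, '/ROOT/Customer/Order/OrderDetail', 2)
WITH (OrderID int '@OrderID',
CustomerID varchar(10) '@CustomerID',
OrderDate datetime '@OrderDate',
ProdID int '@ProductID',
Qty int '@Quantity') b
WHERE Customers.CustomerID = b.CustomerID
-- Free the memory held by the parsed document
EXEC sp_xml_removedocument @hDoc
GO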
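
The row pattern in the delete procedure walks three levels down the document, which is where the hierarchy support comes in. As a sketch of how one rowset can combine values from different levels, the query below reads order-level attributes through the parent axis (`..`) while iterating over the detail rows. The document is a made-up sample shaped to fit the row pattern.

```sql
-- Hypothetical document matching '/ROOT/Customer/Order/OrderDetail'.
DECLARE @idoc int;
DECLARE @doc nvarchar(4000);
SET @doc = N'<ROOT>
  <Customer CustomerID="VINET">
    <Order OrderID="10248" CustomerID="VINET" OrderDate="1996-07-04T00:00:00">
      <OrderDetail ProductID="11" Quantity="12" />
      <OrderDetail ProductID="42" Quantity="10" />
    </Order>
  </Customer>
</ROOT>';

EXEC sp_xml_preparedocument @idoc OUTPUT, @doc;

-- One row per <OrderDetail>. '../@...' patterns read attributes of the
-- parent <Order>; '@...' patterns read attributes of <OrderDetail> itself.
SELECT *
FROM OPENXML(@idoc, '/ROOT/Customer/Order/OrderDetail', 1)
     WITH (OrderID    int         '../@OrderID',
           CustomerID varchar(10) '../@CustomerID',
           OrderDate  datetime    '../@OrderDate',
           ProdID     int         '@ProductID',
           Qty        int         '@Quantity');

EXEC sp_xml_removedocument @idoc;
```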


Summary



OPENXML leverages the existing relational model for use with XML. It provides the following:
-- A mechanism for updating a database with data in XML format
-- Multi-row updates in a single stored procedure call
-- Multi-table updates that leverage the XML hierarchy
-- Queries that join existing tables with XML data
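
As an illustration of the last point, the sketch below joins an OPENXML rowset to relational data to list rows whose first name would change. A table variable stands in for the Employee table so the batch is self-contained; its layout and the sample data are assumptions.

```sql
-- Stand-in for the Employee table (assumed layout and data).
DECLARE @Employee TABLE (eid int PRIMARY KEY, fname varchar(20), lname varchar(20));
INSERT INTO @Employee VALUES (1, 'Ann', 'Lee');
INSERT INTO @Employee VALUES (2, 'Raj', 'Rao');

-- Hypothetical incoming document.
DECLARE @hDoc int;
DECLARE @xml nvarchar(4000);
SET @xml = N'<root>
  <Employee eid="1" fname="Anne" />
  <Employee eid="2" fname="Raj" />
</root>';

EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml;

-- Join the stored rows to the XML rowset on eid and keep only the
-- employees whose incoming first name differs from the stored one.
SELECT e.eid, e.fname AS current_fname, x.fname AS incoming_fname
FROM @Employee e
JOIN OPENXML(@hDoc, '/root/Employee', 1)
     WITH (eid int, fname varchar(20)) x
  ON e.eid = x.eid
WHERE e.fname <> x.fname;

EXEC sp_xml_removedocument @hDoc;
```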



Comments

Author: ketan Italiya, 18 Oct 2013

hey,

thanks nice understanding.

for importing data into SQL Server and then parsing the XML into a relational format.

1.Import XML data from an XML file into SQL Server table using the OPENROWSET function
2.Parse the XML data using the OPENXML function

Thanks,
ketan


