How to read delimited and fixed-field data in VBScript with ADO - VBScript FAQ

There are many ways you might go about handling delimited text files in VBScript. We are talking about readable text files that have commas, tabs, and so on between the fields, along with some sort of row or line delimiter character.

Fixed-field data files are similar: generally plain text, but instead of having field delimiters, the data in each record occurs in columns of predetermined widths.

This FAQ discusses using ADO and the "Microsoft Text Driver" to handle the dirty work.

Text Driver?

The basic idea is to use ADODB objects with a connection string like:

    DRIVER={Microsoft Text Driver (*.txt; *.csv)};DefaultDir=C:\Data\MyDir;

Connection strings can be tricky. Be sure to type everything between the "DRIVER=" and the ";DefaultDir" exactly as shown above. An extra blank space here or there, or a typo, and it simply won't work.

Data File Format

One of the first things you'll want to ask is "how will the Text Driver know how to interpret my data properly?" This includes the delimiters used, the names of the fields, the data types, and so on.

The Microsoft Text Driver is pretty smart.
In the absence of other information it will take a cruise through your data file and make an educated guess. In many cases this works just fine. To some extent its guesses are tempered by a number of registry settings on the machine your script will run on.

In general, though, you may need more control over the Microsoft Text Driver in order to get just the results you want every time.

Schema.ini

To get that control over the way your data files are interpreted you'll need a Schema.ini file. This file must be in the same directory as the text files you want to process. It contains a section entry for each file in that directory.

Detailed information on Schema.ini is available on MSDN. For additional details search the MSDN Library for "Microsoft Text Driver" and "Schema.ini".

An Example

Sometimes it is best to illustrate a concept with a simple example. Here is a text file called mydata.

    Add -> 3. 2
    Delete -> 3.
    Add -> 1.

I've used the symbol -> above to represent tab characters (vbTab, Chr(9), HT, whatever). The file has only 3 records and each has 2 fields, one text and the other an integer.

Now I need to create a Schema.ini file, or add to an existing one if a Schema.ini file already exists. In this case I have a new directory and a new data file, so I'll make a new Schema.ini file.

Schema.ini

    [mydata.]
    Format = TabDelimited
    ColNameHeader = False
    Col1 = Action Text
    Col2 = DataValue Long

We have a section header to identify the file we're describing. Then we identify the format. Then we say the file doesn't have header record(s) that provide column names. Then we describe each column, assigning a field name and a data type. All of this and more is described at the MSDN page cited above.

Using Text Driver From a WSH Script

So the files above set the stage. Now how do we use them to accomplish something? This is where those ADODB objects come in. For a simple case like this all we need is an ADO Recordset object.

From WSH I find that ADO programming is much easier if I use the newer, more generalized, and feature-rich .WSF script format rather than the older .VBS form of desktop scripting. You don't need to use a .WSF, but it has a number of advantages. The two I am taking advantage of here are declarative object syntax (the <object> tag) and declarative reference syntax (the <reference> tag).

These are described in detail in the Microsoft Windows Script reference. You can download this from: http://msdn.

Sample WSH Script

Enough teasing; the example is only complete if you can see an actual script. Here I have a WSH script called ADOText.wsf. I declare an ADO Recordset object that I will use to read the text data file. I also declare a reference to the ADODB type library in order to gain access to the predefined ADO constants. In this case I'm only making use of adOpenKeyset, but in a more meaningful script you would need many other ADO constants.

ADOText.wsf

    <job>
      <object id="oRS" progid="ADODB.Recordset"/>
      <reference object="ADODB.Recordset"/>
      <script language="VBScript">
        Function ScriptPath()
          Dim strSFN
          strSFN = WScript.ScriptFullName
          ScriptPath = Left(strSFN, InStrRev(strSFN, "\") - 1)
        End Function

        Dim sConn, sSource, sResults
        Const cMyName = "Fetching Text Data With ADO"

        sSource = "SELECT * FROM [mydata.]"
        sConn = "DRIVER={Microsoft Text Driver (*.txt; *.csv)};DefaultDir=" & ScriptPath & ";"

        MsgBox "Records will be counted on your OK!", _
               vbOkOnly, cMyName

        'Use of adOpenKeyset is CRITICAL or RecordCount returns -1!
        oRS.Open sSource, sConn, adOpenKeyset
        MsgBox "File contains " & CStr(oRS.RecordCount) & _
               " records.", vbOkOnly, cMyName

        MsgBox "Data will be processed on your OK!", _
               vbOkOnly, cMyName

        sResults = ""
        oRS.MoveFirst
        Do
          sResults = sResults & _
                     "Action: " & oRS.Fields("Action") & _
                     " Value: " & CStr(oRS.Fields("DataValue")) & _
                     vbNewLine
          oRS.MoveNext
        Loop Until oRS.EOF
        MsgBox sResults, vbOkOnly, cMyName

        oRS.Close
      </script>
    </job>

I'm using a function here called ScriptPath() to retrieve the path of the running script. Notice how I used this function to build my connection string. I use it here because I have placed the script in the same directory as my data and schema files. If the script were elsewhere I'd need to designate the data directory in some other manner.

The example simply opens an ADO Recordset, displays the number of records, and then displays the data contained in the Recordset. Nothing fancy, just an example.

Notice that since I didn't create oRS via CreateObject() I also do not have to Set it to Nothing in order to decrement the object reference count. WSH takes care of this automatically for declared objects.

You can copy each of the three files above into appropriately named files in any directory to try this out. Just be sure to replace the "tab" symbols with real tab characters. To run the script just double-click on ADOText.wsf.

Conclusion

This should be enough to get you started with the Microsoft Text Driver. If you have MS Access you'll find it has a wizard for creating complex Schema.ini files. If you need to handle many files with numerous fields you might want to take a look at it.

A good article on ADO can be found at: http://msdn.
This discussion is an introduction to ADO for VB programmers, but most of it will be extremely useful to a VBScripter as well. Even client-side DHTML scripters and ASP scripters should find it enlightening.

The "ADO + Text Driver" approach has many advantages over writing clumsy VBScript code using the FSO and the Split() function, especially when your data contains things like date and time or currency fields. It is even more handy for processing fixed-field data files! Beats the heck out of doing a bunch of Mid() calls over, and over, and over again.

Plus you get the advantage of sorting and filtering if you use client-side cursors!

ADOText2.wsf

    <job>
      <object id="oRS" progid="ADODB.Recordset"/>
      <reference object="ADODB.Recordset"/>
      <script language="VBScript">
        Function ScriptPath()
          Dim strSFN
          strSFN = WScript.ScriptFullName
          ScriptPath = Left(strSFN, InStrRev(strSFN, "\") - 1)
        End Function

        Dim sConn, sSource, sResults
        Const cMyName = "Fetching/Sorting Text Data With ADO"

        sSource = "SELECT * FROM [mydata.]"
        sConn = "DRIVER={Microsoft Text Driver (*.txt; *.csv)};DefaultDir=" & ScriptPath & ";"

        MsgBox "Records will be counted on your OK!", _
               vbOkOnly, cMyName

        oRS.CursorLocation = adUseClient 'Sorting
        'Use of adOpenKeyset is CRITICAL or RecordCount returns -1!
        oRS.Open sSource, sConn, adOpenKeyset
        MsgBox "File contains " & CStr(oRS.RecordCount) & _
               " records.", vbOkOnly, cMyName

        MsgBox "Data will be processed on your OK!", _
               vbOkOnly, cMyName

        sResults = ""
        oRS.Sort = "DataValue ASC"
        oRS.MoveFirst
        Do
          sResults = sResults & _
                     "Action: " & oRS.Fields("Action") & _
                     " Value: " & CStr(oRS.Fields("DataValue")) & _
                     vbNewLine
          oRS.MoveNext
        Loop Until oRS.EOF
        MsgBox sResults, vbOkOnly, cMyName

        oRS.Close
      </script>
    </job>

Happy scripting!

Reading Delimited Files Using ADO

Executive Summary: Although you can use a TextStream object to open a plain-text file, it's not advantageous because you can only read the file from the beginning.
However, you can read delimited files with ADO, which lets you use the Microsoft Jet OLE DB text driver to parse the contents of a delimited file. The Jet OLE DB text driver uses the registry to determine the format of the delimited file.

Reading delimited files is a common scripting task. For example, you might want a script to process a list of users and their email addresses that you exported to a delimited text file using the Microsoft Management Console (MMC) Active Directory Users and Computers snap-in.

Delimited files are plain-text files that often represent the contents of a database table. The data on each line in the file is separated by a delimiter (e.g., a comma). Figure 1 shows a sample delimited file in which the delimiter is a comma. This type of delimited file is known as a comma-separated values (CSV) file. The first line of the file, which is referred to as the header line, names the fields (columns) in the table. The subsequent lines contain the table's records (rows). In this example, the field names and record data are enclosed in double quotes (").

Of course, you can use the FileSystemObject's TextStream object to open plain-text files, but parsing the contents of delimited files presents some interesting problems. For example, you could open the file and use VBScript's Split function to split each line into fields, but doing so can present a problem if the line contains quoted data with an embedded delimiter. For example, consider the following comma-delimited address data:

    "1. Ellison Rd. NW","Albuquerque, NM","8.

Now suppose you use the following line of VBScript code to parse the line:

    Data = Split(Address, ",")

In the above line of code, assume the Address variable contains the address text mentioned above. If you use this code, the Data variable will contain the four-element array shown in Figure 2 rather than the three fields you would expect. The reason is that the Split function doesn't take the double quotes into account, so the comma inside "Albuquerque, NM" is treated as a field delimiter.
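To see the effect concretely, here is a minimal sketch meant to be run with cscript. The address line is a made-up stand-in with the same shape as the sample data (three quoted fields, the second containing a comma); its values and the variable names are hypothetical and not taken from the article's figures.

```vbscript
Option Explicit

Dim Address, Data, i

' Hypothetical sample line: three quoted fields; the second contains a comma.
' Doubled quotes ("") inside a VBScript string literal produce one literal quote.
Address = """100 Example Rd. NW"",""Albuquerque, NM"",""00000"""

' Split knows nothing about quoting, so it cuts at every comma.
Data = Split(Address, ",")

' Prints "Element count: 4" even though the line holds only three fields.
WScript.Echo "Element count: " & (UBound(Data) + 1)

' Show each piece; the city/state field comes back as two broken halves.
For i = 0 To UBound(Data)
    WScript.Echo i & ": " & Data(i)
Next
```

The quote characters also stay attached to each element, so a hand-written parser would have to strip them and rejoin the split field, which is the kind of work a text driver does for you.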
Embedded delimiters are very common when dealing with AD distinguished names (DNs).

Another problem with trying to use a TextStream object to handle database data is that you can read the file only starting from the beginning. If you need to go back to a previous line, you must close the file and open it again, which can be time-consuming if you've got a large text file.

The ADO Solution

Rather than deal with these limitations, you can use ADO to read delimited files. ADO lets you use the Microsoft Jet OLE DB text driver to parse the contents of a delimited file. The text driver treats a delimited file as if it were a database table. The following steps provide a general overview of what you must do to use ADO to read a delimited file:

1. Create a Connection object to establish a connection to the datasource. When dealing with a delimited file, the datasource is the delimited file's directory.
2. Create a Recordset object and query the text file using a SQL query.
3. Iterate the Recordset object to obtain the query's results.

Let's take a more detailed look at these steps. Listing 1 shows a VBScript example of how to read the data from Sample.csv (shown in Figure 1). At the top of Listing 1, the script declares its constants and variables, and then it specifies the directory and filename for the delimited file. (The directory and filename are specified separately for reasons that I'll discuss later in the article.)

Callout A shows how GetAddresses.vbs creates a Connection object and executes the Open method. The Open method's parameter is a connection string that describes the location and nature of the datasource, and it uses a semicolon-delimited list of property=value pairs. The Provider property is always Microsoft.Jet.OLEDB.4.0, and the Data Source property is the directory containing the delimited file. This is why the script declares the CSV file's directory separately: the Data Source property must be a directory name.
Declaring the directory separately also provides you with the flexibility of querying more than one delimited file using a single Connection object.

The Extended Properties property is a semicolon-delimited string, and it must be enclosed in single or double quotes. The first argument in the string is Text, which tells the data provider (i.e., Jet OLE DB) to use its text driver. Next, the Hdr parameter (which must be Yes or No) specifies whether the delimited file has a header line. If you use Hdr=No, the data provider names the fields F1, F2, F3, and so forth. The default value is Hdr=Yes.

In the past, I've seen an additional Fmt=Delimited argument inside the Extended Properties string, which seems to be a way of telling the data provider the type of delimited file you're using. (For example, Fmt=TabDelimited.) However, as far as I can tell, the Fmt parameter is ignored, and the text driver will always use the setting specified in the registry or from the Schema.ini file. I'll describe both the registry location and the Schema.ini file shortly.

Next, you need to use a Recordset object to query the delimited file. Callout B shows how GetAddresses.vbs creates the Recordset object and calls its Open method. The Open method's first parameter is a SQL statement, and its second parameter is the Connection object. The correct values for the final three parameters are specified as constants at the top of the script. For more information about these parameters, see http://msdn.

GetAddresses.vbs uses the following SQL statement to read the contents of the delimited file:

    SELECT * FROM [File] ORDER BY LastName

File is the CSV file's name as specified at the top of the script. The script uses square brackets ([]) around the file's name in case it contains spaces. This query selects every record in the table and sorts the results by the LastName field. Because it's a SQL query, it gives you lots of flexibility in retrieving data from the text file.
For example, in a SQL query, you can use SELECT DISTINCT to return only unique records, WHERE to specify a filter, and ORDER BY to sort the results. See msdn2.microsoft. for more information about the SQL query.

After calling the Open method, the Recordset object contains fields and records in a table format. Because GetAddresses.vbs reads the Sample.csv file shown in Figure 1, the Recordset object will look like the table shown in Figure 3. (Note that the Recordset object's contents are sorted by the LastName column, as specified by the SQL query.) The column headers are the fields' names, and the rows are the individual records.

You can navigate through the Recordset object by using the MoveFirst, MoveNext, MovePrevious, and MoveLast methods. The end of file (EOF) property returns a value of True if the Recordset object contains no records or if you've moved past the last record (e.g., by calling the MoveNext method). To access the data in an individual field in the current record, use the following syntax:

    Recordset.Fields.Item("fieldname")

In this command, Recordset is the Recordset object, and fieldname is the name of the field. Callout C shows how the GetAddresses.vbs script displays the data from the CSV file. The Do Until loop keeps executing as long as the Recordset object's EOF property is False, and the WScript.Echo method echoes the data for the current record.

Controlling the Text Driver

As I mentioned previously, the Jet OLE DB text driver uses the registry (or the Schema.ini file, which I'll describe shortly) to determine the format of a delimited file. These values are stored in the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Text registry subkey. Table 1 lists the two main values (i.e., DisabledExtensions, Format) that you might need to modify.
For security reasons, the text file driver allows only certain file extensions to be considered as delimited files, so if you need to work with a delimited file that has an extension that's not listed in Table 1, you'll need to rename the delimited file or add the file's extension to the DisabledExtensions value in the registry.

The other thing you can do to control the text driver's behavior is to create a plain-text file called Schema.ini in the same directory as the delimited file. Inside the Schema.ini file, you identify the delimited file by putting its name in square brackets. Underneath that, you can specify a set of values that describe the format and layout of the delimited file. Table 2 lists some of the possible values you can use. Figure 4 shows a sample Schema.ini file that specifies that the Accounts.tab file is tab-delimited. For more information about how to use the Schema.ini file, see Microsoft's documentation on Schema.ini.

Note that you can also use the text driver to read fixed-length files (meaning that the data in the file is stored in specific columns); however, because of space limitations, I won't describe how to do that here.

Using ADO to Read Delimited Files

ADO lets you process the data in delimited files much more easily than using string parsing. Now that you know how to set up ADO and control the Jet OLE DB text driver, you should be able to read delimited files without having to deal with many of the time-consuming limitations that come with using the TextStream object.
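As a rough sketch of the three steps described earlier (connect to the directory, query the file, iterate the results), something along the following lines should work. It is not the article's Listing 1. The directory C:\Data is a hypothetical placeholder, the file is assumed to be a CSV with a header line that includes a LastName column, and the three constants are the standard ADO values for adOpenStatic, adLockReadOnly, and adCmdText, declared by hand because a plain .vbs file has no type library reference.

```vbscript
Option Explicit

' Standard ADO enumeration values, declared manually for a plain .vbs script.
Const adOpenStatic = 3
Const adLockReadOnly = 1
Const adCmdText = 1

Dim DataDir, FileName, Conn, RS, Fld, sLine

DataDir = "C:\Data"        ' hypothetical directory holding the delimited file
FileName = "Sample.csv"    ' assumed to have a header line with a LastName column

' Step 1: connect to the directory (not the file) through the Jet text driver.
Set Conn = CreateObject("ADODB.Connection")
Conn.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
          "Data Source=" & DataDir & ";" & _
          "Extended Properties=""Text;Hdr=Yes"""

' Step 2: query the file as if it were a table.
Set RS = CreateObject("ADODB.Recordset")
RS.Open "SELECT * FROM [" & FileName & "] ORDER BY LastName", _
        Conn, adOpenStatic, adLockReadOnly, adCmdText

' Step 3: walk the records and echo every field as name=value.
Do Until RS.EOF
    sLine = ""
    For Each Fld In RS.Fields
        sLine = sLine & Fld.Name & "=" & Fld.Value & "  "
    Next
    WScript.Echo sLine
    RS.MoveNext
Loop

RS.Close
Conn.Close
```

Looping over the Fields collection avoids hard-coding column names other than the one used in ORDER BY. One practical caveat: the Jet 4.0 provider is 32-bit only, so on 64-bit Windows a script like this would likely need to be run with the 32-bit cscript.exe.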