Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Zen Section
id739853429

The Data import function under System>Batch process menu>Data import/export>Data import provides access to a function for creating RefTracker questions from XML formatted data (file loading for batches and a web service for individual questions), as well as access to the XML based Time and cost Journal import process (file loading).

 

Question import

API


The Question import function accessible from Batch process menu>Data import/export>Data import allows question information in a defined XML format to be imported into RefTracker in such a way that a new question (or questions) is created.This API (application programming interface) can be used for one off imports of a batch of questions (such as importing records from a previous Request management system), or for individual question submission via a web service (such a submission of questions referred for other systems (for example, the already existing RefTracker to RefTracker functionality uses this web service).

Using the process for bulk loading question data from an individual file of records

To use this process choose "Question importQuestions" in the import type drop down of the System>Batch process menu>Data import/export>Data import screen.

The XML schema to which the files of question data that is to be imported must be formatted, can be viewed by clicking the appropriate link in the top right of this screen. The schema is generated at the time that you want to view it and so will always be up to date for the database from which it has been extracted. The schema defines the fields for which data can be supplied in XML format to create a question. To be valid, the data for a question MUST include:

  • Question text
  • A response method and matching client contact details for the response method, or, if no Response method data is supplied, client contact details for the default response method.
  • If no other data is supplied, default values will be automatically supplied for all other mandatory fields (for example, if Receipt date is not specified, it will be set to the date and time that the question is created). Where the data will be saved into a RefTracker code table, the schema lists the valid values that can be accepted into that field. If the field is mandatory and no value is provided in your data, the default value (indicated by an asterisk) will be used.


Please contact Altarama if you want to use this functionality, as a small amount of consulting from us on how to import your records will save your own staff considerable work in created the XML formatted input records, and can provide better results in relation to the usability of the data.

As a guide, sample data import files that comply with the XML schema are provided in the exchange/import/request/sample directory.

To run a question import process ensure that the data to be loaded is in the appropriate RefTracker defined XML schema format:

  1. Load your XML formatted file to <YourRefTrackerDomain>/exchange/import/question/input, either directly or by using System>Utilities>Administration utilities>File/Upload download.
  2. Your file will then appear in the Batch process menu>Data import/export/Data import screen if you have selected the "question importQuestions" Import type. Select the file that you want to process by ticking the box to the left of its name.Image Removed
    Image Added
  3. Then select whether the import process should import the question using the Normal question number sequence, or whether a historical numbering sequence should be used (such as when loading request information from a previous request management system).
    For the historical numbering sequence set the prefix to be used (clearly identifying the data as having come from a different source – the ZZZ example is a good choice when loading older records as it ensures that the older records file at the end of any search results) and then set the starting number (Set first question #) for the first imported record in this import file. You can only set a historical sequence prefix and start number once. Once the historical numbering sequence is set, future loads will offer the next historical sequence number.Image Removed
    NOTE: Take a note of the first number that will be used for this import so that you can easily find the questions created from the import. As their date of receipt may well be quite different to today's date, they will most likely NOT be showing at the top of any search that you do!
  4. Then click Submit. An "Are you sure" type popup will display to give you time to consider whether the right selections have been made, and when you select "OK" in that popup, the import process will begin.
    Note that this import process is disk intensive and so should be
    run out of normal operating hours
  5. A summary of the import process will display on your screen as it runs and will also be saved to a log file in the reftracker/exchange/import/log directory. If your run summary indicates any records failed to load, click the "View log file" link to see the details.
    When running the process, any incorrectly formatted XML records will NOT be loaded. Their record number will be recorded in the process log, so that they can be corrected and reloaded. 
  6. When you have finished viewing the summary of the import run, click "Return to Data Import page" to go back to RefTracker.

After running the import process, with a successful data load, you will notice that the file that you process is no longer showing in the Data import screen – it has been moved to the exchange/import/request/archive directory (you can view it by downloading it using System>Utilities>Administration utilities>File upload/download).

After running the import process, with any questions not successfully loaded, the input file remains showing in the Data import screen so that the unprocessed records can be corrected and rerun.


Questions created in this way are identical to any other question with that same data. This of course may mean that the question receipt date will be as specified in the XML data, as will other mandatory question attributes like Received via, etc.The status of the question after loading will be determined by the data provided in the XML data from which the record was created. If the question does not have a closed status but it does have a closed date/time then the status of the new question will be set to closed. Also if the question has a closed status but does not have closed date/time then the question will similarly be set to closed.

By default, the question will be created using the labels provided by the "Default form (System)" (which you can modify if you need to), but Altarama can help you to use a specific form if required. As this is a bulk load, any emails associated with the form used to load the question are NOT sent, and allocation is determined by the data that is loaded rather than the form.  Any questions loaded where fields that are designated as mandatory in the form do not have data loaded into them, will be set to "Incomplete" and will cause the Change screen to display when accessed so that the empty mandatory fields can be completed.  Deliberately loading a record with an empty mandatory field will ensure that this review process is instigated for the loaded record/s.

If the XML record is invalid, but contains data that cannot be mapped to the required RefTracker field for any reason, that data will not be lost – it will be added to the "Other imported data field of the created question.



The history entry created will clearly show that the record was created by the import process, as shown in the screen print below:


Using the Question import web service

This web service is designed to allow data for one question at a time to be submitted – for example, transfer of a question from your IT department's call tracking system to RefTracker, or transfer of the data entered into a form provided by your IT department. Note that use of RefTracker created forms is always preferable to data being input via a form designed elsewhere as there is always the risk that the form designed elsewhere could become incompatible with changed values in RefTracker (code table values changed in RefTracker, but not changed in the external form, for example).

The same schema accessible from the System>Batch process menu>Data import/export>Data import screen, determines the format required to submit question data to the RefTracker question import web service at <yourdomain>/service/addQuestion.

There is limited validation built into this RefTracker import process at this time, so it is important to ensure that the format being used for this data importing is not only compliant with the schema, but is well tested to ensure that it will always create a valid question. Invalid records will simply NOT be loaded.

Please contact Altarama for more information if you want to use this web service.



Time and cost journal entry import

RefTracker allows costs associated with responding to requests, and librarian hours spent developing responses to requests (which can be automatically converted into costs by RefTracker), to be entered as answers are developed using the RefTracker screens (the Answer, Third party and Quick screens respectively provide "Cost incurred" and "Time used" fields for entry of this information).

The RefTracker "Data extract" statistical reports show the total costs accumulated in relation to a question, allowing libraries to report on the hours spent, or costs of, developing a response for billing or management purposes.

Sometimes costs are incurred externally in relation to RefTracker questions, and information about those costs need to be loaded into RefTracker so that the total cost being reported by RefTracker includes those external costs. For example:

  • Online information providers are now making information about the costs incurred in using their information, available as post usage reports. This Time and cost journal entry loading process provides an invaluable opportunity to load this usage information in bulk from the information provided in the file supplied by the Online information provider.
  • Information from invoices received for other post billed costs such as consultants fees and ordered services could also be loaded in this way.

This Time and cost journal entry loading Batch process allows costs and/or time to be loaded as single batch process from information in a published XML file format.

Summary of how to use the Time and cost journal entry loading process

  1. An input file will have entries containing the RefTracker question number, and the associated cost, and/or time, to be loaded to that question. Input files provided by Online provider's, for example, may need some preprocessing to be in the required input format. Full details of the schema for the layout of the required input file, and how to achieve that format are provided as point 1 later in this section.
  2. Load the correctly formatted "input file" into the reftracker/exchange/import/journalentry/input directory
  3. To use this process choose "Time and cost journal import" in the import type drop down of the System>Batch process menu>Data import/export>Data import screen.
  4. The screen will display a list of "input files" waiting to be processed.
  5. Tick the files that you wish to process and then click on Submit.
  6. The selected file/s will be processed by creating a journal entry in the appropriate RefTracker question/s for each cost or time entry in the input file/s. Full details of the processing procedure are provided later in point 2 of this section.
  7. The processed records are moved to a similarly named file in the reftracker/exchange/import/journalentry/archive directory and any unprocessed records are left in the input file the reftracker/exchange/import/journalentry/archive directory, together with a comment about why they could not be processed. A record detailing the processing problem is also written to a file named with today's date in the reftracker/exchange/import/journalentry/log directory. Full details of how errors are handled are provide in point 3 of this section.
  8. You may wish to remove files from the reftracker/exchange/import/journalentry/archive directory after processing and checking.

1. Details of the Time and cost journal entry input file

You may receive files of cost and/or time information in a number of different formats from a number of different vendors. These files need to be pre-processed to achieve a layout that is suitable for loading into RefTracker – the RefTracker journal input XML schema.

At present the COUNTER standard (http://www.projectcounter.org/about.html) only defines how online database usage information should be presented to users, however, the managers of this standard recognize the opportunity for costs to also be provided in this way. As Online information providers will increasingly be providing their usage reports in COUNTER compliant format, the RefTracker Time and cost journal entry Loading process presumes that the information will be provided in a format based on the COUNTER standard, but with additional columns for costs and hours as appropriate.

The COUNTER standard requires files to be in XML format, and this is the emerging standard for data transfer for the future, so RefTracker Time and cost journal entry loading also requires XML format.

Your input data will need to be presented in the RefTracker Time and cost journal entry Loading input file format which is defined by the following XML definition (this format is also provided in a file called journal entry.xsd in the reftracker/exchange/import/schema directory) and can be viewed from the Journal import Schema link in the System>Batch process menu>Data import/export>Data import page.

Here's an example of how that schema might be used:

Code Block
themeEclipse
languagehtml/xml
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<journaldata>
	<journalentry>
		<journal_amount>500.98</journal_amount>
		<transaction_message>Invalid question number [PROB9999]</transaction_message>
		<question_no>PROB9999</question_no>
		<journal_datetime>2007-04-08</journal_datetime>
		<journal_type>100</journal_type>
		<journal_hours>0</journal_hours>
		<journal_text>Beta</journal_text>
		<transaction_id>1006</transaction_id>
		<journal_source>200</journal_source>
	</journalentry>
</journaldata>


About the Time and cost journal entry input file XML format

Records must be enclosed in the journaldata xml tag

Each record must be enclosed in the journalentry tag

Each record must include at least the following tags:

question_no
and journal_amount and/or journal_hours

Details of the XML tags

  • The question_no field is mandatory, and the data in it must match a question number existing in your RefTracker database. Therefore the Question number must include the prefix eg RT1234. This tag specifies the RefTracker question number that the journal entry will be created in.- The journal_amount field should be numeric (no currency sign). If this field is present the amount will be loaded into the RefTracker journal record amount field. Negative values are allowed in order to provide a mechanism for correcting information that may have been accidentally bulk loaded more than once.
  • The journal_hours field can be in the format HH:MM or a numeral that represents the number of minutes. If this field is present, the time specified will be loaded into the RefTracker journal record time field. Negative journal_hours values are not allowed. Important note - journal_hours simply loads hours to the journal – in loading this data RefTracker will not do any automatic multiplication of journal_hours by staff member billing rate, as it cannot tell which staff member spent the hours. Therefore, if you want the cost of these hours to be loaded, you should also include a journal_amount tag that specifies the cost of the hours, in the same input record. Similarly, to correct for journal_hours loaded more than once, use a negative Journal_amount value to correct the impact of those hours.
    Optional XML tags that may be supplied are:
  • The journal_datetime field is used to specify the date on which the transaction specified by this record occurred. It must be in the format yyyy-mm-dd hh:mm where hh:mm is optional, however supply of time information is optional. If not included, it will default to the date and time at which the batch was processed.Note that the date and time specified here will be interpreted as having been specified in the time zone where your RefTracker server is located.
  • The journal_text field is used to attribute a description to the data being loaded. For example you might want to enter the name of the database that was searched to generate the cost being loaded e.g. "ABC database search". If not included, no description will be loaded.
  • The journal_type field is used to indicate the type of data being loaded. Valid values for this tag are the numerical codes displayed to the left of the entries in the "Journal type (time)" and "Journal type (cost)" code tables – they will have codes between 300 and 399 for time and 600 and 699 for costs.The Journal type (time and cost) code tables can be edited to provide other values that describe each of the different types of time and cost that your library can incur e.g. "database searching cost", "contractor cost", "company search fee", etc. Defining a Journal entry (time or cost) code table entry, will often remove the need to supply a Journal_text entry in your input file to describe the cost.Note that defining a journal type describes the data you are loading only - the batch loading process does NOT utilise any default values in these code tables. When using batch loading the amounts MUST be included in the records that you load.
  • The journal_source field is used to indicate the source of the cost or hours in the journal entry – in other words how they were loaded. When loading records using this Journal Entry Loading function, the value of this field MUST be 200 to indicate that they were loaded using the Journal entry Batch loading process. If this tag not included in the XML file, this value will default to 200 when a journal record is created via the loader – so there is no need to include this tag in your xml data record.
  • The transaction_id tag is not actually loaded into the RefTracker database but it can be used to provide a unique id for the record. We recommend use of unique (possibly sequential) numbers this tag as it makes it easier to track issues when there are loading problems.

The transaction_message - you should not include this tag in your data file – when appropriate, it will be inserted by the RefTracker Journal Entry Load function to indicate the reason why that record did not pass RefTracker Journal Entry Load function validation, and so could not be loaded to the RefTracker database. (When you correct the eror you should also remove this tag.)

Except for transaction_id and transaction_message, neither of which are loaded to the RefTracker database, the lines in this XML definition are appropriately named to indicate the fields of the RefTracker Journal Table, into which the data will be loaded.

An example input file, ready to load to RefTracker, might look like:

Code Block
themeEclipse
languagehtml/xml
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<journaldata>
	<journalentry>
		<journal_amount>60.60</journal_amount>
		<question_no>PROB1254</question_no>
		<journal_datetime>2007-09-02</journal_datetime>
		<journal_type>700</journal_type>
		<journal_text>Better database searching costs</journal_text>
		<transaction_id>0002</transaction_id>
		<journal_source>200</journal_source>
	</journalentry>
	<journalentry>
		<journal_amount>90.00</journal_amount>
		<question_no>PROB1242</question_no>
		<journal_datetime>2007-09-02</journal_datetime>
		<journal_type>600</journal_type>
		<journal_hours>1</journal_hours>
		<journal_text>Better database searching costs</journal_text>
		<transaction_id>0003</transaction_id>
		<journal_source>200</journal_source>
	</journalentry>
</journaldata>


But my input file is not in this format – it's not even in XML!

Manipulating data files in, or into, XML is easy.

If your file is in XML format, but just not in the format specified here, you can download a free XML editing/viewing tool from Microsoft at http://www.microsoft.com/downloads/details.aspx?familyid=72d6aa49-787d-4118-ba5f-4f30fe913628&displaylang=en

If your file is in .csv format, it is easy to both manipulate it and convert it to XML by using Microsoft Excel.

Here's an example of how to prepare a CSV file for use in RefTracker Time and cost journal entry Loading including conversion to XML:

  1. Open the .csv file in Excel
  2. Perform any data manipulations required to obtain the mandatory data fields.
  3. Select Data > xml > xml source from menu bar
  4. Click on XML Maps button
  5. Click the add button and browse for and select the file:
    reftracker/exchange/import/schema/journalentry.xsd
  6. Click OK - now you should see an XML schema tree
  7. Click on journal_datetime in the tree and drag to the first cell of the column that contains the date of the transaction (it might be called 'billing date' for example)
  8. Click on question_no in the tree and drag to the first cell of the column that contains the question number to which the journal should be applied (it might be called 'cost centre' for example)
  9. Click on journal_amount in the tree and drag to the first cell of the column that contains the cost to be applied to this question (it might be called 'Net amt' for example)
  10. Select Data > Xml > Export from the menu bar and save to a new file
  11. This new file is now ready for loading via the RefTracker loader (so it needs to be placed in the reftracker/exchange/import/journal entry/input directory


A helpful tutorial for using the Excel XML conversion feature can be found at

{+}http://www.codeproject.com/soap/ExcelListToXml.asp+

(but you can ignore the instructions about creating maps)

2. Processing details

When your data file has been provided in the required format, it needs to be loaded into the reftracker/load/journalentry/input directory. We recommend that files are uniquely named using a file name and date e.g. ABC20071102, so that the archive files for each processing run will also be uniquely named.

To process your input file, log into RefTracker as a System Administrator, click on System, then click on Batch processing, then click on Time and cost journal entry import.

A screen showing all files currently waiting to be loaded in the reftracker/load/journalentry/input directory displays, and the files to be loaded should be ticked before clicking on Submit




The Loader does an initial pass through the input file and reads to the end of the file but does not process any records. This pass is simply to verify that the file can be read. If there are any major XML irregularities they will be detected during this pass and a screen similar to the next screen print will display detailing the errors found.

The Loader then does a second pass – each logical record (bounded by the journalentry xml tag) is extracted. If the journal record passes validation, the data specified in it will be loaded as a new entry in the journal table of the specified question number, and the xml record will be saved to the archive file.

Results of the processing run are displayed in the following screen. In the example below, the red flag indicates not all records were successfully processed – a green flag indicates a fully successful runNote that you can view the files resulting from this processing run by clicking on the file names displayed in this screen. Clicking on the "Input file after processing" will show you any records that were not processed and the transaction_message tag that has been inserted to indicate why the record could not be processed. 




3. Error handling

If the record does not pass validation it will remain in the input file and will not be processed. An xml transaction_message tag will be added to the xml record containing a message indicating why the journal record did not pass validation.

A record is also written to the log file in the reftracker/load/journalentry/log directory as a file named <inputfilename>_<today's date>.txt.

After processing, the input file, now containing only the records in error, each marked with a transaction_message tag describing how the record is in error, can be edited to correct the identified problems and the file can then be processed again. When you correct the input file, you should remove the transaction_message tag that describes the error that you fixed. When the records process successfully, they will be appended to the same file in the /archive directory, that that records from the same input file were transferred to in earlier processing runs.

Note – during processing the records that are loaded are output directly to the archive file, the records that failed validation are output to a temporary file. At the end of processing, the original input file is deleted and the temporary file is renamed to have the same name as the original input file. If there is an unexpected error during processing, it is important that you check for the existence of the archive file, and manually adjust the input file to reflect the records that were processed, in order to ensure that records are not loaded twice (because the renaming process was not undertaken due to the unexpected error).

4. Output from the Time and cost journal entry loading function

Records processed by the RefTracker Time and cost journal entry Loading function create journal entries in the specified question number. They can be viewed in the Journal tab of the Summary screen and, as normal journal entries, will be automatically include in the costs and hours reported by the RefTracker Statistics reports (including the Data extract records). So that the Time and cost journal entry Loader generated entry is obvious in the example below it was loaded with a noticeably earlier date making it appear at the very bottom of the History tab of the Summary screen example below. 


 

5. Subdirectories in the RefTracker directory structure to support this function

reftacker/exchange/import/journalentry/input Place files for loading here

reftracker/exchange/import/journalentry/log Log files will be created here

reftracker/exchange/import/journalentry/archiveProcessed files placed here

reftracker/exchange/import/schema The files defining this process's XML schema

 

...

Zen Master
titleyes
labelsyes
use-other-page-as-masteryes
other-master-page-titleAdministration Training Manual
use-other-page-as-mastercommentsyes
masterown-master
commentsmetadatayes
breadcrumbsyes
metadatayes