A Rapid Introduction to the AMI Dataset Search Interface

The full AMI tutorial takes about 90 minutes to 2 hours. This page gives you a rapid introduction to all the functions of the interface without going into details.

Contents

What is AMI?
Where does AMI get its information?
What is an ATLAS dataset?
How do I search for a dataset?
Which information can I get from the result of an AMI dataset search?
What is the schema of the AMI dataset catalogue?
Why can I sometimes not find a dataset when I can see its existence in other catalogues?
Can I refine the search?
Can I simply browse all of the information in AMI?
Can I bookmark an AMI page?
Why doesn't the back button of my browser work?
Can I use AMI without going through the web interface?
How can I extract information from AMI?
How do I write to AMI?

What is AMI?

AMI is a generic cataloguing system. It uses self-describing databases to construct queries and web interfaces over catalogues with differing structures. Queries are sent in parallel to many databases. The results are returned to the web interface catalogue by catalogue. A drop-down combo box at the top of the results page allows you to navigate to any of the catalogues where a result was found. Additional information on the organization of data in AMI is found here.

Where does AMI get its information?

Almost all data is pulled into AMI by special tasks. Real data is uploaded from the Tier 0 database after a trigger is set by Tier 0 management. Details of Monte Carlo datasets and of the reprocessing of real data are pulled from the task request system and correlated with information read from the production DB. Additional information comes from the Monte Carlo dataset number catalogue. It is also possible to update AMI using Python scripts. The POOL conditions dataset catalogue is updated in this way.

Additional information about data sources

What is an ATLAS dataset?

It is not possible to put all the data from one run into one file.
ATLAS uses aggregations of files containing events. These aggregations are called datasets. The distribution of ATLAS data introduces some additional complications. The events for one particular set of simulation conditions are not necessarily all produced at one site, or all by one production task. Typically the first task produces only a few events and, when everything works, more events are produced by one or more further tasks. Some datasets are the result of more than 20 successful tasks. The set of files produced by one production task is known within ATLAS as a "tid dataset", because such datasets are named with a suffix "_tidnnnnnn", where nnnnnn is the task number. A dataset which is useful for physics generally contains several tid datasets. The distributed data management (DDM) system manipulates these datasets using the concept of containers. The container name is the physics dataset name plus a slash suffix.

The datasets which AMI catalogues are the logical equivalent of the DDM container datasets, but AMI does not add a slash as the last character of every name.

How do I search for a dataset?

There are several ways to search for a dataset.

The simple search does a partial string search on the dataset name. For example, if you are looking for a dataset with "Jimmy" in the name, just type "Jimmy" in the search box.

If you do not know whether the property you are looking for is in the name, you can try a keyword search. This looks in several fields for the string or strings you have typed in the box. Searching on the dataset name is much faster than a keyword search.

The character "%" is used for wildcarding. If you do not type a % character in your string, AMI adds a leading and a trailing % character, so that the search is in fact a partial string search. If you do type a % character in your search string, AMI does not add the leading and trailing % wildcards.
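
The wildcard rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated on this page, not AMI's actual implementation: the function names and the dataset names are made up for the example, and the sketch assumes that matching is case-sensitive and that "%" stands for any run of characters, as in an SQL LIKE pattern.

```python
import re


def normalize_search_term(term: str) -> str:
    """Apply the rule described above: a term without '%' becomes a
    partial string search; a term containing '%' is left untouched.
    (Hypothetical helper, not part of AMI.)"""
    if "%" in term:
        return term
    return f"%{term}%"


def matches(pattern: str, name: str) -> bool:
    """Check a %-wildcard pattern against a dataset name.
    Assumption: '%' matches any run of characters and the comparison
    is case-sensitive."""
    # Escape the literal pieces and join them with '.*' where '%' stood.
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, name) is not None


# Made-up dataset names, for illustration only.
names = ["csc11.000001.example.Jimmy", "mc08.000002.example.csc"]

for term in ("Jimmy", "csc%"):
    pattern = normalize_search_term(term)
    hits = [n for n in names if matches(pattern, n)]
    print(term, "->", pattern, hits)
```

Under these assumptions the term "Jimmy" is turned into "%Jimmy%" and finds the name that contains it anywhere, while "csc%" is passed through unchanged and only finds the name that begins with "csc".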
For example, to find datasets whose name starts with "csc", type "csc%" in the search box. If you type a space character in a keyword search, the search is an "or" over the space-separated parts of your string.

The advanced search allows you to select from a number of pre-defined parameters using drop-down boxes. It is much faster than the keyword search, so use it if you only want to search on this subset of parameters. This interface was designed with the help of the ATLAS usability task force.

What information can I get from the result of an AMI dataset search?

The first set of results you obtain contains a subset of all the information we hold about the datasets. Notice that you can browse through all the results, change the default number displayed, reorder them, and even edit which fields you want to look at and the order in which they are displayed.

A click on the spreadsheet icon shows you the set of values available in the column and the number of occurrences of each value. Some column names are in blue, which means that when your mouse passes over them you will see an explanation of the column name.

Hopefully this subset of information is enough for you to decide whether you want to know more about the dataset. If you do, click on the "details" link to the left of the dataset name. The interface then brings up all the information we hold, including the information in sub-tables. For example, if the dataset is a Monte Carlo dataset, you should see that some information about the production tasks is available in a table called prodsys_task (or event_range, for older catalogues). To find out more about navigating around the results, go to the full tutorial.

What is the schema of the AMI dataset catalogue?

AMI does not have either a single catalogue or a single schema. Datasets are stored in several catalogues, and each catalogue can have a different schema.
The self-description contained in each AMI catalogue allows the web interface, and the searches, to adapt. This allows us to support schema evolution and to remain backward compatible. The very first catalogue made in AMI, for the LAr Test Beam, is still searchable from our latest web interface!

At present the decision about the placement of a dataset record in a catalogue is made implicitly by the data puller tasks, by a mapping of the series name (the first part of the dataset name) to the catalogue name. Input using a pyAMI command requires an explicit catalogue name to be given.

Why can I sometimes not find a dataset when I can see its existence in other catalogues?

In general AMI will not register an ATLAS dataset unless it has previously been registered in DQ2. But for several reasons AMI does not automatically register all DQ2 registered datasets.

You can only expect to find official ATLAS datasets in AMI. Even some of these may not be visible by default, because we have a dataset trashing mechanism. If a dataset is known to be bad, it is flagged as TRASHED and is not shown in the dataset search unless the user specifically asks for it. This can be done using the advanced search interface (just untick the exclusion box). Datasets can be trashed either AUTOMATICALLY, when the updating task notices that all production tasks related to the dataset have been aborted, or EXPLICITLY. EXPLICIT trashing may be done by production managers, for example when it is known that the data has been removed from all storage elements.

Can I refine the search?

Yes you can. There are two ways: either use the "group by" and/or the "magnifying glass" icons from your result set, or go to the complete "refine Query" interface. "refine Query" is not trivial unless you have some SQL experience. If you want to try it, we recommend that you first run through the full tutorial.

Can I simply browse all of the information in AMI?

Yes you can.
Go to your dataset selection home page (this is the one you get to from the AMI portal if you choose ATLAS > dataset selection in the top horizontal menu). Then, from your home page, select "Databases" from the menu.

Can I bookmark an AMI page?

The technology we use does not allow you to bookmark pages directly, because the parameters are hidden. But we do provide a way to make personal bookmarks. They will then show up in your personal AMI home page.

This one shows all the valid datasets in the csc catalogue produced by OSG. To make bookmarks you must be a registered user. To find out more, please go to the full tutorial. We also have a system to allow you to embed AMI queries in your web pages. See here for more information and here for an example.

Why doesn't the back button of my browser work?

This is a common problem of web sites which use advanced web technology. The result obtained from the browser back button seems to be unpredictable. We have implemented our own "back" system, or breadcrumb trail. As you browse the results of an AMI dataset search you will see a series of "back" buttons appearing which remember your path, so you can click on them to get back to where you were. You can also remove those you no longer need. Please see here for more information.

Can I use AMI without going through the web interface?

Yes you can. The technology we use is based on a system of commands. AMI commands all have an XML output. Commands are passed on your behalf by the web interface, and the results are formatted using XSLT. You can also pass the commands directly to our web service in several ways. We have provided wrappers to some AMI commands, but any command can be passed without a specific wrapper. Try clicking on the

Can I extract information from AMI?

Yes you can. Notice the "Export" item in the catalogue level menu of the dataset search result. Four possibilities are offered.
Note that you can choose to export ALL the results of your query or just those which are currently SELECTED on the web interface. This means that if your query has returned a large number of results, you do not have to try to display them all before exporting.

How do I write to AMI?

AMI has a hierarchical system of managing user rights. It is possible to give a user rights to update or insert information in selected catalogues. Please send us an email to get more information about how to write to AMI.

last update: 2009-05-27