ICARUS-Search-Perspective

The search_perspective.png perspective provides the following search types:

Back to ICARUS Main Page

Index:

  1. How to set up a new search

  2. Search Menu

  3. Result Outline

  4. Dependency-Search:

    1. Search Parameter (Dependency-Search)

    2. Graph Query Editor (Dependency-Search)

    3. Result Outline (Dependency-Search)

  5. Error Mining:

    1. Search Parameter (Error Mining)

    2. Error Mining Query Editor

    3. Result Outline (Error Mining)

  6. Tutorials (including videos):

    1. Tutorial Dependency Search (passive constructions) with one grouping operator

    2. Tutorial Dependency Search (passive constructions with overt logical subjects)

    3. Tutorial Dependency Search (passive constructions with overt logical subjects and object)

    4. Tutorial Error Mining

I. How to set up a new search:

  1. Click on search_new.png to create a new search.

  2. Afterwards the search needs to be configured: search_configuration.png

    • Type: Select the desired search mode (dependency, error mining, coreference, ...)
    • Data-Set: Select the treebank/document
    • Query: Clicking search_query.png opens the query editor. There may be different types of query editors depending on the search type.

    • Parameters: Search parameters depending on the search type.
  3. Execute the search using the search_execute.png button

  4. View the result by double-clicking the search result or by using the inspect button search_inspect.png

Back to index

II. Search Menu:

search_manager_menu.png

Search History Toolbar: search_history-tb.png . Every executed search is listed in the search history. The history is available until you close your ICARUS session. The figure shows three search history items. During the search process the icons to the left may change:


III. Result Outline:

attachment:search_result_1D.png



IV. Dependency-Search:

  1. Search Parameter (Dependency-Search)

  2. Graph Query Editor (Dependency-Search)

  3. Result Outline (Dependency-Search)


Search Parameter (Dependency-Search):


Graph Query Editor (Dependency-Search):

search_query-editor-tab.png This tab is used to build a query. Graph Editor Toolbar: search_graph-tb.png

Note: Copying and pasting nodes/edges can be used to transfer graphs from or into other perspectives (e.g. Tutorial 1D, ...)

Text Query Editor Toolbar: search_query-tb-text.png


Result Outline (Dependency-Search):

search_result-tab.png Use this tab to browse the search results. The visualization comes in four different presentation styles, depending on the number of grouping operators used. The different styles are described in the following section.

Result Outline Toolbar: search_result-base-tb.png

0. No grouping operator search_grouping-operator.png is used.

The result is presented as a list of sentences. Every occurrence that matches the query is colored blue. Results (0D) attachment:search_result_0D.png

1. One grouping operator search_grouping-operator.png is used.

All lemma types found are shown in the list (red) to the left. The user may select one lemma type to get all instances that match the query. Every occurrence that matches the query is colored blue and the "grouped" lemma is colored red. Results (1D) attachment:search_result_1D.png

2. Two grouping operators search_grouping-operator.png are used.

The result is presented as a table. Grouping operator one (red) is on the y-axis and grouping operator two (green) is on the x-axis (Note: The x-/y-axis may be flipped by clicking on search_flip-table.png ). Every occurrence that matches the query is colored blue. Results (2D) attachment:search_result_2D-a.png attachment:search_result_2D-b.png

3. Three grouping operators search_grouping-operator.png are used.

The result is presented as a list of sentences. Every occurrence that matches the query is colored blue. Results (3D) attachment:search_result_3D-a.png attachment:search_result_3D-b.png
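The relation between the number of grouping operators and the result layout can be illustrated with a minimal Python sketch. This is not ICARUS code: the `hits` list, the `group_hits` and `flip` helpers and the example lemmas are hypothetical, and each hit is assumed to carry the values bound by the grouping operators of one match.

```python
from collections import Counter

# Hypothetical matches: each tuple holds the values bound by the grouping
# operators of one hit (here: verb lemma, logical-subject lemma).
hits = [
    ("kiss", "boy"),
    ("kiss", "girl"),
    ("see", "boy"),
    ("kiss", "boy"),
    ("kiss", "girl"),
]


def group_hits(hits, dims):
    """Count hits per combination of the first `dims` grouped values."""
    return Counter(hit[:dims] for hit in hits)


# One grouping operator: a list of grouped values with their hit counts (1D).
one_d = group_hits(hits, 1)

# Two grouping operators: the cells of a table (2D), with the first
# operator on the y-axis (rows) and the second on the x-axis (columns).
two_d = group_hits(hits, 2)
rows = sorted({r for r, _ in two_d})
cols = sorted({c for _, c in two_d})
table = [[two_d.get((r, c), 0) for c in cols] for r in rows]


def flip(table):
    """Swap the x- and y-axis of the table (transpose it)."""
    return [list(col) for col in zip(*table)]
```

With no grouping operator, the hits would simply be listed sentence by sentence, so no counting step is needed in that case.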

At the lower part of the graph panel is the text outline. The list contains all search results of the selected instance. The selected sentence is shown in the graph panel.

Toolbar: text-tb.png



V. Error Mining

  1. Search Parameter (Error Mining)

  2. Error Mining Query Editor

  3. Result Outline (Error Mining)

To detect sequence annotation errors within part-of-speech tags, we implemented the algorithm introduced by Dickinson and Meurers (2003) [1]. Additionally, for structured annotations we chose the approach of Boyd et al. (2008) [2], which targets inconsistencies within dependency structures.

We designed and built a graphical user interface (GUI) that is easy to handle. Combining state-of-the-art error detection algorithms with a user-friendly interface widens their range of application, because the algorithms can be used by a wider audience without deeper knowledge of computers. Even non-expert users are thus able to find inconsistent POS tags and dependency structures within a corpus.

[1] Dickinson, M. and Meurers, W. D. (2003). Detecting errors in part-of-speech annotation. In Proceedings of the 10th Conference of the European Chapter of the Association for Computational Linguistics (EACL-03), pages 107–114, Budapest, Hungary.

[2] Boyd, A., Dickinson, M., and Meurers, D. (2008). On detecting errors in dependency treebanks. Research on Language and Computation, 6(2):113–137.


Search Parameter (Error Mining):


Error Mining Query Editor:

search_query-editor-tab.png This tab is used to build a query. A single query item consists of the following parts:

  1. Include Tag (boolean) = All tags that are ignored (Include Tag=true) are mapped onto a special "ignoredtag"-subclass. This option has priority over the new tag definition.

  2. Tagclass (string) = If the current tag matches the Tagclass, it may be included or assigned a new tag (if specified)

  3. new Tag (string) = The new tag for all tags that have a matching Tagclass within the query list, as specified in 2.

If the current tag is not found within the query list, it is neither ignored nor assigned a new tag, and the algorithm simply continues with the current tag. The benefit of this design is that there is no need to put the whole tag set into the query system.
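The lookup described above can be sketched in a few lines of Python. This is an illustration only, not the ICARUS implementation: the `QueryItem` class and the `map_tag` function are hypothetical names, the tags in the usage example are made up, and matching a tag against a Tagclass is assumed to be plain string equality (the actual matching rule is not specified here).

```python
from dataclasses import dataclass
from typing import List, Optional

# Name taken from the description above; how ICARUS represents the special
# subclass internally is an assumption.
IGNORED = "ignoredtag"


@dataclass
class QueryItem:
    include_tag: bool              # "Include Tag": True means the tag is ignored
    tagclass: str                  # "Tagclass": the tag this item applies to
    new_tag: Optional[str] = None  # "new Tag": optional replacement tag


def map_tag(tag: str, items: List[QueryItem]) -> str:
    """Return the tag the algorithm would continue with."""
    for item in items:
        if item.tagclass == tag:       # assumption: exact string match
            if item.include_tag:       # ignoring has priority over renaming
                return IGNORED
            if item.new_tag:
                return item.new_tag
            return tag
    return tag                         # not in the query list: keep the tag


# Usage with made-up query items:
items = [
    QueryItem(include_tag=True, tagclass="SYM", new_tag="X"),
    QueryItem(include_tag=False, tagclass="NNS", new_tag="NN"),
]
```

Here `map_tag("SYM", items)` yields the ignored class even though a new tag is given, `map_tag("NNS", items)` yields `"NN"`, and a tag that is not listed, such as `"VBZ"`, is returned unchanged.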

The Error Mining Query Editor provides the functionality to group tags together, rename tags or exclude tags from the search. It is organized in three parts attachment:search_qe-errormining-view.png. On the left side there are buttons to create/edit or delete a single query:

In the middle there is an overview over all specified queries represented as a list. attachment:search_qe-errormining-list.png

Below are three buttons to manage the ngram query item list:

The capability of saving a query to an XML (Extensible Markup Language) file and loading it again later is useful if the user specifies a query and wants to reuse it on different corpora. Using reset deletes all specified query items.


Result Outline (Error Mining):

search_result-tab.png Use this tab to browse the error mining results. ICARUS provides two views for browsing the potential errors. The first view search_variation-ngrams.png shows a list of all variation n-grams found, whereas the second view search_label-distribution.png shows the label distribution over word forms.

Result Outline Toolbar: search_result-em-tb.png

Variation N-Gram View (Error Mining):

attachment:search_result_em-pos-variation.png

Variation N-Gram Toolbar search_em-variation-tb.png

Each variation entry has the following format: "Listindex) n-gram-length Occurrence-Count ngram"

Example n-gram: search_em-single-result.png .

When the user selects an n-gram, additional information about the nucleus (part-of-speech tags, tag count) is displayed below the list. To inspect the result, the user may double-click on an entry in the variation n-gram list. In the example, clicking on search_em-single-result.png returns all sentences with the nucleus "'s" (POS, VBZ and NNP).

If the user is only interested in instances where "'s" was tagged as VBZ, they first have to select the n-gram in the list and afterwards double-click on one of the lines in the lower part of the window search_em-single-tag.png that contain that particular combination of word form and part-of-speech tag. Each time the user clicks on an n-gram, a new tab is created, which allows the user to jump back to previous results without having to recreate them (run the search again).
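The notion of a variation n-gram and its nucleus can be made concrete with a small Python sketch. It is a simplified, brute-force illustration of the idea behind Dickinson and Meurers (2003), not their incremental algorithm and not the ICARUS implementation; the function name `variation_ngrams` and the toy corpus are assumptions made for this example.

```python
from collections import defaultdict


def variation_ngrams(corpus, n):
    """Find n-grams of word forms whose nucleus carries more than one tag.

    corpus: list of sentences, each a list of (form, tag) pairs.
    Returns {(ngram_forms, nucleus_index): {tag: count}}.
    """
    seen = defaultdict(lambda: defaultdict(int))
    for sent in corpus:
        forms = [form for form, _ in sent]
        for start in range(len(sent) - n + 1):
            ngram = tuple(forms[start:start + n])
            # Every position of the n-gram is a candidate nucleus.
            for i in range(n):
                tag = sent[start + i][1]
                seen[(ngram, i)][tag] += 1
    # Keep only nuclei that were tagged in more than one way.
    return {key: dict(tags) for key, tags in seen.items() if len(tags) > 1}


# Toy corpus (made up for illustration).
corpus = [
    [("John", "NNP"), ("'s", "POS"), ("book", "NN")],
    [("John", "NNP"), ("'s", "VBZ"), ("here", "RB")],
    [("John", "NNP"), ("'s", "POS"), ("car", "NN")],
]

unigrams = variation_ngrams(corpus, 1)
bigrams = variation_ngrams(corpus, 2)
```

In this toy corpus the only nucleus with variation is "'s", both as a unigram and inside the bigram "John 's". The tag counts stored per nucleus correspond to the kind of information shown below the n-gram list.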

Label Distribution View (Error Mining):

attachment:search_result_em-pos-distribution.png

Variation Label Distribution Toolbar search_em-distribution-tb.png

On the left, a list of unique label combinations is shown. Selecting one displays a list of the word forms that occur with exactly these tags in the corpus. This list is shown below search_result-em-label-dist-b.png . To the right, the frequencies of the different labels are shown in a bar chart. The left-most bar (here red) for each label always shows the total frequency. The user may select further word forms from the list to add bars to the chart that show the frequencies for each selected word form.
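A minimal Python sketch of the data behind such a view follows. It is an illustration under assumptions, not ICARUS code: the names `label_distribution` and `totals` and the token list are hypothetical, and the "total frequency" is assumed here to be the sum over all word forms that share the selected label combination.

```python
from collections import Counter, defaultdict


def label_distribution(tokens):
    """Group word forms by the exact set of labels they occur with.

    tokens: iterable of (form, tag) pairs.
    Returns (per_form, by_combo): tag counts per word form, and the
    word forms listed under each unique label combination.
    """
    per_form = defaultdict(Counter)
    for form, tag in tokens:
        per_form[form][tag] += 1
    by_combo = defaultdict(list)
    for form, tags in per_form.items():
        by_combo[frozenset(tags)].append(form)
    return per_form, by_combo


def totals(per_form, forms):
    """Sum the label frequencies over the given word forms."""
    total = Counter()
    for form in forms:
        total.update(per_form[form])
    return total


# Made-up tokens for illustration.
tokens = [
    ("'s", "POS"), ("'s", "VBZ"), ("'s", "POS"),
    ("run", "NN"), ("run", "VB"),
    ("walk", "VB"), ("walk", "NN"), ("walk", "NN"),
]
per_form, by_combo = label_distribution(tokens)
nn_vb_forms = by_combo[frozenset({"NN", "VB"})]
nn_vb_totals = totals(per_form, nn_vb_forms)
```

Selecting the combination NN/VB lists the word forms "run" and "walk"; `nn_vb_totals` would feed the total bars, and `per_form["walk"]` the additional bars for a selected word form.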

Results Presentation:

attachment:search_result_em-pos-distribution.png



VI. Tutorials (including videos)

  1. Tutorial Dependency Search (passive constructions) with one grouping operator

  2. Tutorial Dependency Search (passive constructions with overt logical subjects)

  3. Tutorial Dependency Search (passive constructions with overt logical subjects and object)

  4. Tutorial Error Mining


1) Tutorial Dependency Search (passive constructions) with one grouping operator:

Video Download:

If the user does not know exactly how passive constructions are annotated in a treebank, they can use e.g. mate-tools or weblicht to parse a sentence that contains a passive construction and copy and paste the structure into the search graph.

  1. Parsed sentence "Mary was kissed by a boy." search_example_mt.png .

  2. Select the passive construction search_example_mt_selected.png

  3. Copy the selected cells and edges search_copy.png and switch to the search_perspective.png

  4. Paste selected cells and edges into the search query editor window search_paste.png

  5. The resulting graph when using the arc-layout (recommended) search_arc-layout.png search_cp-graph-arc.png

  6. In the following step the search graph (query) is generalized (double-click an edge or node to open the edge/node editor).
    1. Node 1 properties search_edit-node.png changed to search_edit-node-b.png

    2. Edge properties search_example-edge.png changed to search_example-edge-b.png

    3. Node 2 properties search_example-node2.png changed to (added grouping operator <*>) search_example-node2-b.png search_example-node2-c.png

    4. These changes result in a new more generalized version of the search graph (below is the textual query representation) search_example_sg+text.png This query matches passive constructions in English as annotated in the CoNLL08 Shared Task data set.

  7. Results (1D) attachment:search_result_1D.png


2) Tutorial Dependency Search (passive constructions with overt logical subjects):

Video Download:

We are interested in passive constructions with overt logical subjects, grouped by the lemma of the verb and the lemma of the logical subject. We may reuse the search graph for passive constructions or build the query completely manually (shown here).

  1. First of all clear the graph editor panel (if there is any remaining graph) using search_clear.png

  2. Add four new nodes search_add-node.png . You may reorder them automatically by clicking search_reorder-graph.png

  3. Your graph editor should look like search_t2_4nodes.png

  4. There are two ways to connect nodes / add edges:
    1. Select two nodes search_t2_addingedge-a.png and connect them by clicking on search_add-edge.png

    2. Place the cursor in the middle of the desired (source) node. A green border will show up search_hl-node.png . Hold the left mouse button and move to the (target) node. When you reach the target node, a green border shows up again. Release the left mouse button to draw an edge between those nodes search_t2_addingedge-b.png

  5. Double click on the nodes/edges to specify the constraints. (Note: Adding constraints may mess up the graph layout. You may use search_reorder-graph.png to redraw the graph)

    1. Node 1: Lemma = be search_t2-n1.png

    2. Node 2: Lemma = <*> (red grouping operator); Part-Of-Speech = VBN search_t2-n2.png

    3. Node 3: Form = by search_t2-n3.png

    4. Node 4: Lemma = <*> (green grouping operator) search_t2-n4.png

    5. Edge 1: Relation = VC search_t2-e1.png

    6. Edge 2: Relation = LGS search_t2-e2.png

    7. Edge 3: Relation = PMOD search_t2-e3.png

  6. Once all nodes and edges are linked and the constraints above have been set without errors, the search graph should look like this: search_t2-sg.png

    • (Textual query: [lemma=be [relation=VC, lemma<*>1, pos=VBN [relation=LGS, form=by [relation=PMOD, lemma<*>2]]]])

  7. Results (2D) attachment:search_result_2D-a.png attachment:search_result_2D-b.png
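To make the meaning of the textual query in step 6 more tangible, the following Python sketch matches an equivalent nested query against a toy dependency tree for "Mary was kissed by a boy." It is not the ICARUS search engine: the `Node` and `Query` classes, the `match` function and the hand-built tree are hypothetical, and the matcher is deliberately simplified (it stops at the first match and does not check that two query children bind to different tree nodes).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    form: str
    lemma: str
    pos: str
    relation: str = "ROOT"            # relation to the head
    children: List["Node"] = field(default_factory=list)


@dataclass
class Query:
    constraints: Dict[str, str] = field(default_factory=dict)
    group: Optional[int] = None       # index N of a lemma<*>N operator
    children: List["Query"] = field(default_factory=list)


def match(q: Query, node: Node, bound: Dict[int, str]) -> bool:
    """Return True if `node` satisfies `q`; grouped lemmas go into `bound`."""
    if any(getattr(node, key) != val for key, val in q.constraints.items()):
        return False
    local = dict(bound)               # only commit bindings on success
    if q.group is not None:
        local[q.group] = node.lemma
    for child_q in q.children:
        if not any(match(child_q, child, local) for child in node.children):
            return False
    bound.update(local)
    return True


# [lemma=be [relation=VC, lemma<*>1, pos=VBN
#     [relation=LGS, form=by [relation=PMOD, lemma<*>2]]]]
query = Query({"lemma": "be"}, children=[
    Query({"relation": "VC", "pos": "VBN"}, group=1, children=[
        Query({"relation": "LGS", "form": "by"}, children=[
            Query({"relation": "PMOD"}, group=2),
        ]),
    ]),
])

# Hand-built toy tree; the relation labels follow the constraints used above.
boy = Node("boy", "boy", "NN", "PMOD", [Node("a", "a", "DT", "NMOD")])
by = Node("by", "by", "IN", "LGS", [boy])
kissed = Node("kissed", "kiss", "VBN", "VC", [by])
was = Node("was", "be", "VBD", "ROOT",
           [Node("Mary", "mary", "NNP", "SBJ"), kissed])

bound: Dict[int, str] = {}
found = match(query, was, bound)
```

For this sentence the match succeeds and the two grouping operators bind the lemmas "kiss" and "boy", which is the pair that would contribute to one cell of the 2D result table.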


3) Tutorial Dependency Search (passive constructions with overt logical subjects and object):

Video Download:

In tutorial 1) we showed how to create a query using a copied graph from the parser. Tutorial 2) shows how to create a query from scratch. In tutorial 3) we will extend the search graph used in 2) with an additional grouping operator.

  1. We start with the following search graph search_t2-sg.png

  2. Add one new node search_add-node.png . You may reorder the nodes automatically by clicking search_reorder-graph.png

  3. Your graph editor should look like search_t3-n5added.png

  4. Connect the "red" node with the new node using one of the following options:
    1. Select both nodes search_t3-addedge-c.png and connect them by clicking on search_add-edge.png

    2. Place the cursor in the middle of node 2. A green border will show up search_t3-addedge-a.png . Hold the left mouse button and move to the new node. When you reach the target node, a green border shows up again search_hl-node.png . Release the left mouse button to draw an edge between those nodes search_t3-addedge-b.png

  5. Double click on the new node/edge to specify the constraints. (Note: Adding constraints may mess up the graph layout. You may use search_reorder-graph.png to redraw the graph)

    1. Node 5: Lemma = <*> (brown grouping operator) search_t3-n5.png

    2. Edge 4: Relation = OBJ search_t3-e4.png

  6. Once all nodes and edges are linked and the constraints above have been set without errors, the search graph should look like this: search_t3-sg.png

    • (Textual query: [lemma=be [relation=VC, lemma<*>1, pos=VBN [relation=LGS, form=by [relation=PMOD, lemma<*>2]][relation=OBJ, lemma<*>3]]])

  7. Results (3D) attachment:search_result_3D-a.png attachment:search_result_3D-b.png
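Since the number of grouping operators decides which result view is shown, it can be handy to count them in a textual query. The short Python sketch below does this with a regular expression. It relies only on the `<*>N` notation visible in the two textual queries of these tutorials; the function name `grouping_operators` is hypothetical and other notations are not handled.

```python
import re


def grouping_operators(query: str):
    """Return the sorted indices of the <*>N grouping operators in a query."""
    return sorted({int(index) for index in re.findall(r"<\*>(\d+)", query)})


query_2d = ("[lemma=be [relation=VC, lemma<*>1, pos=VBN "
            "[relation=LGS, form=by [relation=PMOD, lemma<*>2]]]]")
query_3d = ("[lemma=be [relation=VC, lemma<*>1, pos=VBN "
            "[relation=LGS, form=by [relation=PMOD, lemma<*>2]]"
            "[relation=OBJ, lemma<*>3]]]")

dims_2d = len(grouping_operators(query_2d))
dims_3d = len(grouping_operators(query_3d))
```

The query from tutorial 2 contains two grouping operators and the extended query from this tutorial contains three, in line with the 2D and 3D results shown.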


4) Tutorial Error Mining:

Video Download:


extern/ICARUS-Search-Perspective (last edited 2014-04-25 12:09:32 by GregorThiele)