Build a concordance

Goal

Build a concordance to examine the context of occurrence of a given string.

Prerequisites

Some text has been imported into Orange Textable (see Cookbook: Text input) and possibly further processed (see Cookbook: Segmentation manipulation).

Ingredients

Widget     Segment        Context
Icon       segment_icon   context_icon
Quantity   1              1

Procedure

Figure 1: Widgets used to build a concordance and their interfaces

  1. Create an instance of Segment and an instance of Context on the canvas.
  2. Drag and drop from the output connection (right-hand side) of the widget instance that emits the segmentation to be searched for the query string (e.g. Text Field) to the input connection (left-hand side) of the Segment instance.
  3. Also connect both the Text Field instance and the Segment instance to the Context instance (thus forming a triangle).
  4. Open the Segment instance’s interface by double-clicking on its icon on the canvas. In the Regex field, type the string whose context of occurrence will be examined (here: hobbit), then enter a recognizable Output segmentation label, such as key_segments.
  5. Click the Send button (or make sure the Send automatically checkbox is selected).
  6. Open the Context instance’s interface by double-clicking on its icon on the canvas.
  7. In the Units section, select the segmentation that contains the occurrences of the query string (here: key_segments) using the Segmentation drop-down menu.
  8. In the Contexts section, choose Mode: Containing segmentation and select the segmentation that contains the original text (here: text_string, as emitted by the Text Field instance) using the Segmentation drop-down menu.
  9. Tick the Max. length checkbox and set the maximum number of characters that should be displayed on either side of each occurrence of the query string.
  10. Click the Compute button (or make sure the Compute automatically checkbox is selected).
  11. A table showing the results is then available at the output connection of the Context instance; to display or export it, see Cookbook: Table output.
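
For readers who want to see what the procedure above computes, here is a minimal sketch in plain Python. It does not use Orange Textable's own API: the function name concordance, the max_length parameter, and the sample sentence are all hypothetical and chosen for illustration. It finds each occurrence of a query pattern and keeps at most max_length characters of context on either side, in the same spirit as the Max. length option of the Context widget.

```python
import re


def concordance(text, pattern, max_length=20):
    """Return one (left context, match, right context) tuple per match.

    Illustrative helper, not part of Orange Textable.
    """
    rows = []
    for match in re.finditer(pattern, text):
        start, end = match.span()
        # Keep at most max_length characters on either side of the match.
        left = text[max(0, start - max_length):start]
        right = text[end:end + max_length]
        rows.append((left, match.group(), right))
    return rows


# Hypothetical sample text standing in for the imported segmentation.
sample = (
    "In a hole in the ground there lived a hobbit. "
    "The hobbit was fond of visitors."
)

for left, key, right in concordance(sample, "hobbit", max_length=15):
    # Right-align the left context so that the matches line up in a column.
    print(f"{left:>15} | {key} | {right}")
```

Each printed row corresponds to one line of the concordance table: the query string in the middle, with its left and right contexts truncated to the chosen length.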

Comment

  • In the Regex field of the Segment widget you can use the full syntax of Python’s regular expressions (cf. Python documentation); for instance, to restrict your search to whole words, you can enclose the query string in word boundary anchors \b (in our example, \bhobbit\b).
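
The effect of the word boundary anchors can be checked directly with Python's re module. The short sketch below uses a made-up sentence (an assumption for illustration, not taken from the recipe's text) to compare the plain query with the anchored one.

```python
import re

# Hypothetical sample sentence; note that matching is case-sensitive by default.
text = "A hobbit met two hobbits near Hobbiton."

# The plain query also matches inside longer words such as "hobbits".
plain = re.findall(r"hobbit", text)

# With \b anchors, only the standalone word "hobbit" is matched.
whole = re.findall(r"\bhobbit\b", text)

print(len(plain))  # 2
print(len(whole))  # 1
```

Here "Hobbiton" is not matched by either query because of its capital H; adding the re.IGNORECASE flag would make the plain query match it as well.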