.. meta::
   :description: Orange Textable documentation, mining Humanist illustration
   :keywords: Orange, Textable, documentation, illustration, Humanist

Illustration: mining Humanist
=============================

The following example is meant to show *what* Orange Textable typically does,
without considering (for now) every detail of *how* it does it.

In a paper reflecting on terminology in the field of Digital Humanities [#]_,
Patrik Svensson compares the evolution of the frequency of the expressions
*Humanities Computing* and *Digital Humanities* over 20 years of archives of
the `Humanist discussion group `_. He uses these figures to show that while
the former term remains prevalent over these two decades, the latter has been
rapidly gaining ground since the 2000s.

The same experiment can be run with Orange Textable, by building a "visual
program" like the one shown in
:ref:`figure 1 <illustration_fig1>` below:

.. _illustration_fig1:

.. figure:: figures/mining_humanist_schema.png
    :align: center
    :alt: Mining Humanist with an Orange Textable schema
    :scale: 80%

    Figure 1: Mining Humanist with an Orange Textable schema.

Such a program is called a *schema*. Its visible part consists of a network of
interconnected units called *widget instances*. Each instance belongs to a
type, e.g. :ref:`URLs`, :ref:`Recode`, :ref:`Segment`, and so on. Widgets are
the basic building blocks with which a variety of text analysis applications
can be built. Each corresponds to a fundamental operation, such as "import
data from an online source" (:ref:`URLs`) or "replace specific text patterns
with others" (:ref:`Recode`).

Connections between instances determine the flow of data in the schema, and
thus the order in which operations are carried out. Several parallel paths can
be constructed, as demonstrated here by the :ref:`Recode` instance, which
sends data to :ref:`Segment` as well as :ref:`Count`.

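
For readers who think in code, the data flow described above can be mimicked,
very roughly, in plain Python. The sketch below is not Orange Textable code:
the functions ``recode``, ``segment`` and ``count`` are hypothetical stand-ins
for the widget types of the same name, the input texts are invented toy data
replacing what a :ref:`URLs` instance would import, and the regular
expressions are assumptions chosen for illustration (one that deletes lines
starting with ``|`` or ``>``, one that matches the two expressions of
interest). It also assumes that the counting step tallies the segments found
by the segmentation step within each text passed on by the recoding step.

.. code-block:: python

    import re
    from collections import Counter

    # Hypothetical stand-ins for widget types; none of these names or
    # signatures come from Orange Textable itself.

    def recode(texts, pattern, replacement):
        """Replace every match of a regex in each text (cf. Recode)."""
        regex = re.compile(pattern, re.MULTILINE)
        return {label: regex.sub(replacement, text)
                for label, text in texts.items()}

    def segment(texts, pattern):
        """Extract the substrings matching a regex from each text (cf. Segment)."""
        regex = re.compile(pattern)
        return {label: regex.findall(text) for label, text in texts.items()}

    def count(contexts, units):
        """Tally the units found within each context (assumed role of Count)."""
        return {label: Counter(units.get(label, [])) for label in contexts}

    # Invented toy input, keyed by year; real archives would be much larger.
    texts = {
        "1990": "Humanities Computing is discussed.\n> Humanities Computing again\n",
        "2008": "Digital Humanities and Humanities Computing.\n| Digital Humanities\n",
    }

    # The output of the recoding step feeds two downstream steps.
    recoded = recode(texts, r"^[|>].*\n?", "")
    segments = segment(recoded, r"Humanities Computing|Digital Humanities")
    table = count(contexts=recoded, units=segments)

    for year, frequencies in sorted(table.items()):
        print(year, dict(frequencies))

Note that ``recoded`` is passed to both ``segment`` and ``count``, which
mirrors the two connections leaving the :ref:`Recode` instance in the schema.
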

Widget instances can (and indeed must) be individually parameterized in order
to "fine-tune" their operation. For example, double-clicking on the
:ref:`Recode` instance of :ref:`figure 1 <illustration_fig1>` above displays
the interface shown in :ref:`figure 2 <illustration_fig2>` below. What this
particular configuration means is that every line beginning with the symbol
``|`` or ``>`` (**Regex** field) should be replaced with an empty string
(**Replacement string**): in other words, remove those lines that are marked
as being part of a reply to another message. There is a fair amount of
variation between widget interfaces, but regular expressions play an important
role in many of them and Orange Textable's flexibility owes a lot to them.

.. _illustration_fig2:

.. figure:: figures/mining_humanist_recode.png
    :align: center
    :alt: Interface of Recode widget in the Humanist example

    Figure 2: Interface of the :ref:`Recode` widget.

After executing the schema of :ref:`figure 1 <illustration_fig1>` above, the
resulting frequencies can be viewed by double-clicking on the **Data Table**
instance, whose interface is shown in :ref:`figure 3 <illustration_fig3>`
below. On the whole, these figures lend themselves to the same interpretation
as that of Patrik Svensson, but they differ considerably from the frequencies
he reports. This might be explained by the fact that, in the present
illustration, we have used *preprocessed* data
`made available on the Humanist website `_, or it might be that we have not
processed the data exactly as Svensson did. The user can always refer to the
Orange Textable schema (including the parameters of each instance) to
understand exactly which operations it performs. [#]_ In this sense, Orange
Textable not only attempts to make the construction of text analysis programs
easier; it also aims to make *communicating* and *understanding* such programs
easier.

.. _illustration_fig3:

.. figure:: figures/mining_humanist_results.png
    :align: center
    :alt: Monitoring the frequency of two expressions over time

    Figure 3: Monitoring the frequency of *Humanities Computing* vs.
    *Digital Humanities*.

.. [#] Svensson, P. (2009). Humanities Computing as Digital Humanities.
       *Digital Humanities Quarterly 3(3)*. Available `here `_.

.. [#] The schema can be downloaded from :download:`here `. Note that two
       decades of Humanist archives amount to dozens of megabytes and that
       retrieving these data from the Internet can take a few minutes
       depending on bandwidth. Please be patient if Orange Textable appears
       to stall while the schema is being opened.