Download

 

To run the system you need to:

  • Download the IRASubcat archive: IRASubcat.zip or the tar version, IRASubcat.tar.

  • Uncompress the file. This creates a folder named IRASubcat.
  • Put your corpus (for example corpus.xml) into the IRASubcat folder. It must be an XML file in UTF-8 encoding.

  • Identify the morphosyntactic attribute and value that mark verbs in the corpus (for example sint=V).

  • Identify the XML element that is the parent of the elements carrying the attribute under study (for example sentence).

  • Identify the attribute or lexical item whose values you want as keys in the output dictionary (for example lexical or lema).

  • If you have an existing dictionary, verb list, or configuration file, put them into the IRASubcat folder. If you do not provide a config.cfg, the system runs with the default configuration. If you want the system to work from a list of verbs, provide a file containing them. The files must be in UTF-8 format.

  • Run the system from the command line, with the prompt inside the IRASubcat folder:

    >python IRASubcat.py corpus.xml sint=V sentence lexical (with corpus.xml examples 1 and 2)

    >python IRASubcat.py corpus.xml sint=V sentence lema (with corpus.xml example 2)

    >python IRASubcat.py corpus.xml sint=V sentence lexical config.cfg (with corpus.xml examples 1 and 2)

    >python IRASubcat.py corpus.xml sint=V sentence lema config.cfg (with corpus.xml example 2)

    Example corpus.xml (1)

    <corpus>
      <sentence ID='1'>
        <word>They</word>
        <word sint='V'>are</word>
        <word sint='V'>playing</word>
      </sentence>
    </corpus>

    Example corpus.xml (2)

    <corpus>
      <sentence ID='1'>
        <word lema='they' sint='pron'>They</word>
        <word lema='be' sint='V'>are</word>
        <word lema='play' sint='V'>playing</word>
      </sentence>
    </corpus>
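
The three identification steps above (the verb attribute, the parent element, and the key for the output dictionary) can be tried out on a corpus before running IRASubcat. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; it is not IRASubcat's own code. The helper name verb_keys and its parameters are hypothetical, and it assumes that "lexical" refers to the word form (the element text), while any other key, such as lema, is read as an XML attribute.

```python
import xml.etree.ElementTree as ET

# Example corpus.xml (2) from this page, inlined so the sketch is self-contained.
EXAMPLE_CORPUS = """<corpus>
  <sentence ID='1'>
    <word lema='they' sint='pron'>They</word>
    <word lema='be' sint='V'>are</word>
    <word lema='play' sint='V'>playing</word>
  </sentence>
</corpus>"""


def verb_keys(xml_text, verb_spec="sint=V", parent="sentence", key="lema"):
    """Return (parent ID, key value) pairs for every verb found.

    Hypothetical helper: it mirrors the command-line arguments
    (sint=V sentence lema) but is not part of IRASubcat.
    """
    attr, value = verb_spec.split("=", 1)
    root = ET.fromstring(xml_text)
    found = []
    for unit in root.iter(parent):
        for child in unit:
            if child.get(attr) != value:
                continue
            # Assumption: "lexical" means the surface form (element text);
            # any other key is read as an XML attribute of the word.
            item = child.text if key == "lexical" else child.get(key)
            found.append((unit.get("ID"), item))
    return found


if __name__ == "__main__":
    print(verb_keys(EXAMPLE_CORPUS, key="lema"))     # [('1', 'be'), ('1', 'play')]
    print(verb_keys(EXAMPLE_CORPUS, key="lexical"))  # [('1', 'are'), ('1', 'playing')]
```

If the key is lema and a verb has no lema attribute (as in example 1), this sketch yields None for that verb, which is a quick way to see why lema is listed only for example 2.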

  • Optional verb list. A verb list must be a UTF-8 file like this:

    • verb1

    • verb2

    • verb3
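
Since the corpus and the optional files must be UTF-8, it can help to check them before a run. The sketch below is an illustration under stated assumptions, not part of IRASubcat: the helper names (is_utf8, write_verb_list, build_command) and the file name verbs.txt are hypothetical, and the one-verb-per-line layout is assumed from the example above. How a verb list is handed to IRASubcat is not covered here; build_command only reproduces the positional order of the commands shown earlier.

```python
import sys
import tempfile
from pathlib import Path


def is_utf8(path):
    """Return True if the file at `path` decodes cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def write_verb_list(path, verbs):
    """Write one verb per line in UTF-8 (layout assumed from the example)."""
    Path(path).write_text("\n".join(verbs) + "\n", encoding="utf-8")


def build_command(corpus, verb_spec, parent, key, config=None):
    """Assemble the arguments in the positional order of the commands above."""
    args = [sys.executable, "IRASubcat.py", corpus, verb_spec, parent, key]
    if config is not None:
        args.append(config)
    return args


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        verb_file = Path(folder) / "verbs.txt"  # hypothetical file name
        write_verb_list(verb_file, ["verb1", "verb2", "verb3"])
        print(is_utf8(verb_file))  # True
    # Everything after the interpreter path, as it would be typed at the prompt.
    print(" ".join(build_command("corpus.xml", "sint=V", "sentence", "lema", "config.cfg")[1:]))
```

The list returned by build_command can be passed to subprocess.run with the working directory set to the IRASubcat folder, which matches the requirement that the prompt be inside that folder.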

 

 

Design by IRASystems