Tag List

I'd like to try the Tag List feature. Let's say I have a flat text file. 

What should I use for the file type when I upload it into Datameer?

Amin Torabi

5 comments

  • Jason Arrigo

    Hey Amin,

    If your flat text file has delimiters, you can use CSV/TSV. If that doesn't work well, have a look at the Documentation page for Types of Data Supported to figure out what kind of file type might fit best.

    For a tag list widget, you need a string column as the Word entry. The Value used to measure each word entry must be a number.

    To work with the raw values, you can either choose the column data types appropriately when configuring the schema, or convert them in a workbook as necessary.

    Thanks,
    Jason

  • Amin Torabi

    Hi Jason, 

    Let's say I'd like to create a tag list of the DJT inauguration speech, and I have the transcript in txt format.

    What do I do next?

    What file type should I choose when uploading?
    How do I create the word-value pairs? Any ideas?

  • Pablo Redondo

    If it's completely unstructured text (no real delimiters), I would recommend using either the "HTML File Type" format or the regex format.

    The HTML format imports your text as one large string in a single field. You can then use workbook functions such as TOKENIZELIST() and TOKENIZE() to split the text into words, on spaces, etc.

    The regex format lets you define a regular expression that parses the file during ingestion.
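
As an alternative outside Datameer, the word-value pairs could also be prepared before upload. The following is only a minimal sketch in Python, assuming you are able to preprocess the transcript yourself; the function names `word_value_pairs` and `to_csv` are made up for this illustration and are not part of Datameer. It splits the text into words with a regular expression, counts each word, and writes a two-column CSV (a string column and a number column), which matches what the tag list widget expects.

```python
import csv
import io
import re
from collections import Counter


def word_value_pairs(text, top_n=None):
    """Return (word, count) pairs, most frequent first."""
    # Lowercase the text, then extract runs of letters.
    # An apostrophe is allowed inside a word (e.g. "don't").
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(words).most_common(top_n)


def to_csv(pairs):
    """Render the pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["word", "count"])
    writer.writerows(pairs)
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder text; replace with the contents of your transcript file.
    sample = "the quick fox and the lazy dog and the cat"
    print(to_csv(word_value_pairs(sample, top_n=2)))
```

The resulting file could then be uploaded as CSV, with `word` as the string column and `count` as the numeric value.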

  • Pablo Redondo

    No worries! One more note: if the file is very large (e.g. gigabytes), I recommend using the TOKENIZE() function in a different sheet that references the field containing the text. TOKENIZE() splits the text and creates a new row for every word, paragraph, etc. If you do it in the same sheet, the gigabyte-sized row is repeated for every new row, producing a file too large to render and write. Hope this makes sense.
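
To get a feel for the size difference described above, here is a rough back-of-the-envelope sketch in Python. It is a simplified model under stated assumptions, not a measurement of how Datameer stores data: it assumes that in the same-sheet case every token row also carries the full original text, and the function names and numbers are hypothetical.

```python
def same_sheet_bytes(text_bytes, token_count, avg_token_bytes):
    """Estimated size when each token row repeats the full original text."""
    return token_count * (text_bytes + avg_token_bytes)


def separate_sheet_bytes(token_count, avg_token_bytes):
    """Estimated size when each row holds only the token itself."""
    return token_count * avg_token_bytes


if __name__ == "__main__":
    # Hypothetical figures: a 1 MB text containing 200,000 tokens
    # that average 5 bytes each.
    text_bytes = 1_000_000
    token_count = 200_000
    avg_token_bytes = 5

    print(same_sheet_bytes(text_bytes, token_count, avg_token_bytes))
    print(separate_sheet_bytes(token_count, avg_token_bytes))
```

Under these assumptions, the same-sheet estimate comes to roughly 200 GB while the separate-sheet estimate is about 1 MB, and the gap grows with both the text size and the number of tokens.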
