Questions tagged [annotation]

35 questions
13
votes
5 answers

What are helpful annotation tools (if any)?

I'm looking for tools that would help me and my team annotate training sets. I work in an environment with large sets of data, some of which are un- or semi-structured. In many cases there are registrations that help in finding a ground truth. In…
S van Balen
  • 1,364
  • 1
  • 9
  • 28
5
votes
2 answers

Inter-Annotator Agreement score for NLP?

I have several annotators who annotated strings of text for me, in order to train an NER model. The annotation is done in JSON format, and it consists of a string followed by the start and end index of named entities, along with their respective…
Adnos
  • 81
  • 4
4
votes
2 answers

Tool for annotation of images for semantic segmentation

I have been searching for a software tool that I can use for annotating images. More specifically, I want to produce annotations for semantic segmentation, meaning I want to create masks. I want to be able to create training data for…
4
votes
1 answer

How to deal with annotation errors?

I know my annotators are not perfect and sometimes make mistakes. What would be the best way to deal with annotation errors in my training data?
Edamame
  • 2,785
  • 5
  • 25
  • 34
4
votes
2 answers

Using doccano for Aspect Based Sentiment Analysis annotation

Currently looking for a good tool to annotate sentences regarding aspects and their respective sentiment polarities. I'm using SemEval Task 4 as a reference. The following is an example in the training dataset: it is…
3
votes
3 answers

How should labeled data from multiple annotators be prepared for ML text classification?

My specific question is how NLP data from multiple human annotators should be aggregated - though general advice related to the question title is appreciated. One critical step that I've seen in research is to assess inter-annotator agreement by…
3
votes
1 answer

Bias that makes annotators accept a prediction rather than coming up with a different label

Many annotation tools can speed up the classification of images (or other data) by providing a prediction of the correct label which the user can accept or correct. However, humans have a tendency to leave things as they are (Status Quo Bias) or…
moi
  • 131
  • 2
2
votes
1 answer

Manual Data Cleanup Tools

I am writing an ETL pipeline for geospatial data of the form: place_name,address,longitude,latitude,id_linking_to_other_dataset. As the last step in the pipeline, I would like to apply manual transformations submitted by reviewers. Some of these…
Hayden
  • 21
  • 2
2
votes
1 answer

How would you build a big production-ready image training dataset from scratch?

How would you most likely create a large production-ready image training dataset from scratch, including annotations for an image classification task? We will take a large number of images (~1 million) with industrial cameras and save them in an S3…
Basti
  • 21
  • 3
2
votes
1 answer

Online Audio annotation tools

I need to find a decent online annotation tool to transcribe audio. There are some requirements for a potential tool: I should be able to deliver audio files to a few labelers. I should be able to track which files went to which labeler. It should…
Aidos
  • 123
  • 3
2
votes
1 answer

Are there any open-source text annotation tools for multi-label classification?

I have large texts in each document and I want to know if there are any open-source text annotation tools available online for multi-label annotation. Each sentence takes two labels. If there are any, please let me know.
user_12
  • 347
  • 3
  • 10
2
votes
1 answer

How to train a neural network for high recall?

I would like to train a neural network for named entity recognition to tag an unlabeled dataset of texts. The generated labels will then be checked via a crowdsourcing platform. The goal is to annotate the dataset. Therefore, the neural net should…
1
vote
1 answer

Identify outliers for annotation in text data

I read the book "Human-in-the-Loop Machine Learning" by Robert (Munro) Monarch about Active Learning. I don't understand the following approach to get a diverse set of items for humans to label: Take each item in the unlabeled data and count the…
Mykola Zotko
  • 77
  • 2
  • 14
1
vote
1 answer

For object detection, should I resize my custom images first and then start the annotation, or won't it matter?

I have a custom dataset of images of size (1080 x 1920) and I am trying to use YOLOv3 for object detection. I noticed that the YOLOv3 model accepts an input image size of 416 x 416. So I am unsure whether I should resize the image and apply zero-padding…
1
vote
0 answers

Corpus suggestion for financial domain

I am looking for a financial corpus or any form of publicly available financial texts that are replete with technical terms and acronyms. Any suggestions are appreciated.
user3070752
  • 111
  • 2