Questions tagged [research]

72 questions
14
votes
3 answers

Why does everyone use BERT in research instead of LLaMA, GPT, PaLM, etc.?

It could be that I'm misunderstanding the problem space and the iterations of LLaMA, GPT, and PaLM are all based on BERT like many language models are, but every time I see a new paper on improving language models it takes BERT as a base and adds…
Ethan
8
votes
1 answer

Which of the NIPS 2014 papers are most significant, and why?

As a newcomer to the field, I find many of the NIPS 2014 papers fascinating, but it is difficult for me to evaluate which ones represent real progress over current approaches. Which papers do you think are most significant and are likely to have a…
7
votes
1 answer

Why use mean revenue in a split test?

I asked a data science question regarding how to decide on the best variation of a split test on the Statistics section of StackExchange. I hope I will have better luck here. The question is basically, "Why is mean revenue per user the best metric…
Keith
5
votes
4 answers

Where can I find resources and papers regarding Data Science in the area of Public Health?

I'm quite new to Data Science, but I would like to do a project to learn more about it. My subject will be Data Understanding in Public Health, so I want to do some introductory research into public health. I would like to visualize some data with the…
5
votes
4 answers

What kind of regression model should I do?

My research question is to examine the effect of "receiving attention" from other members in an online community on "sustained participation" on the website. I decided to measure "sustained participation" of each user by calculating average time…
user27954
4
votes
5 answers

Classification training using probabilities and not raw classes (factors)

I have a problem where instead of having classes, i.e. a vector of 0s and 1s, I have the probability of an observation belonging to a class. A vector with 0.1, 0.95, 0.2, 0.3, etc. The obvious approach is using regression and it works relatively…
wacax
4
votes
1 answer

Resources for Promotion/Demotion Strategies for ML Item Recommendation Systems?

We are looking to design a system where specific items or categories of items can be boosted/promoted up or relegated/demoted down the recommendation order. What are the common strategies or standards for doing this? A cursory Google search did not…
4
votes
1 answer

Zero-shot learning for tabular data?

Can anyone point me to methods for zero-shot learning on tabular data? There is some very cool work being done for zero-shot learning on images and text, but I'm struggling to find work being done to extend these techniques to tabular data.
4
votes
1 answer

Determining completeness of dataset

I'm hoping you have some research or experience with determining the completeness of a data set. I'm trying to use a Twitter dataset I scraped myself and want to have an indication of its completeness. Obviously, I will miss some data but I am…
3
votes
1 answer

How to determine the abnormality of a specific variable by taking into account all the other variables in the data?

I have a machine learning/anomaly detection problem. I have a variable Y and several other variables X. The goal is to quantify the degree of abnormality of the data on Y, but I have to take into account the values of the other variables…
3
votes
1 answer

Identify given patterns in unstructured data like text files

I wasn't sure whether to ask this here or on Stack Overflow, but since I am also seeking research papers/algorithms and not only code, I decided to ask here. When I have a text, I can manually write a regex to find all the possible outputs from what…
Tasos
3
votes
1 answer

Need advice on a research topic

I am about to choose an ML research topic for my master's thesis, but I am at a dead end. The problem is that while reading research papers, I find solutions but not open problems. For now, I have come up with these ideas: Research neural network…
3
votes
1 answer

How can there be more true positives than positives?

I am currently reading "Learning from Little: Comparison of Classifiers Given Little Training". In "3 Experiment Results", the following graph is shared: The experiment is described as follows: We begin by examining an example set of results for the average…
3
votes
2 answers

What is the loss function defined by Mnih and Hinton in their paper “Learning to Label Aerial Images from Noisy Data”?

In section 3.3 of the paper, they state that they use the cross entropy. Then they define the probability for a label to be a false positive as $\theta_0$ and a false negative as $\theta_1$. They use it to somehow modify the loss function but never…
Borbag
3
votes
2 answers

How to measure Entity Ambiguity?

When using or building a system for Entity Linking, is there a well-known measure for the "ambiguity degree" of an entity? Is there an approach for comparing named entities by how difficult they are to disambiguate?