TLDR

I have a background in MLOps and machine learning engineering. I started at a new employer (as the first AI engineer) and failed at a time series forecasting project. My approach is detailed below; any ideas on what I could have done better?

Original Task

As described by my non-technical boss (no background in machine learning), the goal was to find anomalies in a cost database in BigQuery. No other details were given, but he said that, as a senior engineer, I should figure out the specifics.

Fair enough, but the data has a dozen different cost attributes: department level, individual customer level, account manager level, pre-onboarding cost, post-trading cost, resource cost, etc. The domain is fairly new to me, so initially I was a bit flustered trying to figure out what to model and which costs to look for anomalies in.

Anyway, after about a week, I delivered an anomaly detection model and basic results in the form of Python scripts, notebooks, graphs and PowerPoint decks, covering

  • my judgement and assumptions about which costs are relevant
  • which features to look at to identify an anomaly
  • future steps for pushing it to a production application and making it accessible to the users (internal company users from other departments)
  • a request for feedback on my assumptions

The AI modelling part was trivially simple in itself. I also insisted on surfacing the basic ideas and results to the stakeholders in different departments (who would be the consumers/users) to get domain feedback. But my boss kept giving what was, in my eyes, relatively inconsequential feedback at the visual level, like

  • show a pie chart here instead of a bar chart
  • show the cost on a per-department basis instead of a per-account-manager basis
  • show the median of the past three quarters here
  • incorporate a user-specified threshold on some cost outlier data (it was all running as a Python script, so there was no user as such; I mocked it by setting a variable to a threshold)

and many others like this. The data is available in BigQuery, and anyone can create a view with group-by, filtering, etc. (and I did), but these requests had nothing to do with anomaly detection (they were just different ways to slice, dice and present the data), and we went back and forth on them a few times.

Several times I said something along the lines of

If you have a specific requirement on the business logic, what kind of chart you want to see, which costs you want to model, or what you think is an anomaly, can you tell me?

The response was usually something like

You are an expert on ML, you should figure it out.

My General Workflow (after presenting the basic results and exploratory analysis)

I incorporated actionable feedback as soon as it came (within two working days) and documented the discussions, progress and updates in a shared file and on a Jira board to keep a record. But my requests to actually talk to the users about what they would find useful were ignored on several occasions, with reasons like: Jack is on vacation, Bob is on a business trip, Joye is very busy, etc.

Needless to say, my boss eventually got impatient with it, and I got the axe.

So, the goal of this post is not to seek sympathy but to ask how you would approach the whole project (the expectation management + the data + the ambiguity). As I said, the raw technical task seemed simple enough, as is generating a few views in BigQuery to see, e.g., which department spent the most in a given quarter.

So the concrete questions are

  • Do you think the project is an AI/ML project at all?
  • How would you gather the requirements in a more concrete manner, so that there is something definite to deliver against?

P.S. They do not even have a definition of an anomaly in mind. Initially, I used the Spectral Residual model (from Microsoft; there is a paper on it) to define anomalies, but they could not understand it. So I shifted to simple Z-score-based anomaly detection (using the mean and standard deviation).
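
To make the Z-score rule concrete, here is a minimal sketch of what such a check could look like. It is not the actual script from the project: the function name `zscore_anomalies`, the sample `monthly_costs` values and the threshold values are all assumptions made up for illustration, and the threshold parameter stands in for the "user-specified threshold" mentioned above.

```python
from statistics import mean, stdev


def zscore_anomalies(values, threshold=3.0):
    """Return (index, value, z) for points whose |z-score| exceeds threshold.

    Hypothetical helper: z = (value - mean) / sample standard deviation.
    """
    if len(values) < 2:
        return []  # a standard deviation needs at least two points
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, nothing can be flagged
    flagged = []
    for i, v in enumerate(values):
        z = (v - mu) / sigma
        if abs(z) > threshold:
            flagged.append((i, v, z))
    return flagged


# Made-up monthly costs for one department, with one obvious spike.
monthly_costs = [100, 102, 98, 101, 99, 103, 97, 100, 250, 101]

for index, cost, z in zscore_anomalies(monthly_costs, threshold=2.0):
    print(f"month {index}: cost={cost}, z={z:.2f}")
```

With this sample the spike is flagged at a threshold of 2.0 but not at the common default of 3.0, because the spike itself inflates the standard deviation of such a short series. That is one reason the threshold, and the definition of "anomaly" behind it, is worth agreeing on with the actual users rather than guessing.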

Della