Week 9

This week I’ve added 3 more iterations, not a lot more compared to last week but these iterations are deeper. I noticed that clipping wasn’t clipping the right amount of data. I’m still working on it, and this is the trickiest line of code I’ve ever had to work with. There’s no reason that I can see that the data wouldn’t be clipping like it should. It’s frustrating! But that’s how it goes.

Right now the classifier is identifying light curves on several thousand randomized TESS light curves, so I’ve moved beyond the test data and am testing the real thing. So far it’s flagging 1% of the light curves as eclipsing binaries, and that’s about half of what we’d expect to see. Still, when only 2% of your data set is the “true” class, finding half of them is a BIG deal! And it’ll make a future machine learning model work nicely.

Still, there are some false positives popping up, but not as many as before. I think once I sort out clipping, I’ll be able to cut this in half. At the very least, there are only a few dozen being picked up for a few thousand light curves. That’s not bad! (And ML models need to train on false examples that look similar to true examples anyway! Win win!)

Written on August 14, 2020