Wowee, this week was one of the hardest for me! While I have a pretty hefty statistics background, I’ve never used the Lomb-Scargle periodogram or the autocorrelation function before. I’d only heard them mentioned in passing and had no idea what they were actually doing. I’m glad I had this experience through DREU. I can’t imagine learning this in grad school when I’m juggling so many other responsibilities! But first, I should probably talk about why I’m doing this.
I mentioned earlier that the first chunk of my DREU will be spent building training data for a machine learning model. This is a step in that direction! It also gives me a good sense of the sheer amount of work that goes into making statistical models to pick out the information that machine learning makes so easy. I have to tell you, this week was a lot of work. It was fun and informative, but it was a lot. And every time I add a layer to help identify eclipses statistically, it adds to computation time. With the Lomb-Scargle (LS) periodogram and the autocorrelation function (ACF) added, I now run through 60 light curves (remember, nearly 20,000 data points each) in roughly 10 minutes. In those 10 minutes, my code generates a light curve, a Lomb-Scargle periodogram, and an autocorrelation function periodogram for each one. I’m not quite done with what I’ll likely need to develop a classifier (which will lead to a training set), but I’m getting there.
Both of these functions try to identify periodic signals within the light curve, but they do it in different ways. The Lomb-Scargle periodogram fits a sinusoidal model at each frequency between the minimum and maximum that you specify, and gives each frequency a power rating. The higher the power, the more likely the periodic signal in the light curve has that frequency. In other words, it puts a sine wave on top of the light curve and judges how closely it matches the data. High power? There’s your frequency! And to get a period, which is more useful for astronomers, all you have to do is invert it (period = 1/frequency).
The autocorrelation function is also trying to find a high-powered period, but it does it by taking a chunk of the data, based on a min and max period that you give it, and repeatedly putting it on top of the data while moving it to the right. With no shift at all, the data sits exactly on top of itself, so it matches perfectly and gives you ultimate power! But you won’t find a high power again until, and only if, the chunk you are sliding comes to rest on top of a similar chunk. The size of the shift that gets you there is your period.
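To make those two descriptions concrete, here is a minimal sketch in plain Python. It is not the code from my pipeline: the fake light curve (a smooth 2.5-day wiggle standing in for a real eclipse signal), the 0.1-day sampling, the 1 to 10 day period range, and the helper names lomb_scargle_power and autocorrelation are all assumptions made up for illustration. It computes the classical, unnormalized Lomb-Scargle power and a simple autocorrelation that assumes evenly spaced data.

```python
import math

def lomb_scargle_power(t, y, freq):
    """Classical (unnormalized) Lomb-Scargle power at one frequency in cycles/day."""
    mean = sum(y) / len(y)
    resid = [v - mean for v in y]
    w = 2.0 * math.pi * freq  # angular frequency
    # The time offset tau keeps the sine and cosine terms independent of
    # each other, which is what lets the method handle uneven sampling.
    tau = math.atan2(sum(math.sin(2 * w * ti) for ti in t),
                     sum(math.cos(2 * w * ti) for ti in t)) / (2 * w)
    c = [math.cos(w * (ti - tau)) for ti in t]
    s = [math.sin(w * (ti - tau)) for ti in t]
    yc = sum(r * ci for r, ci in zip(resid, c))
    ys = sum(r * si for r, si in zip(resid, s))
    cc = sum(ci * ci for ci in c)
    ss = sum(si * si for si in s)
    return 0.5 * (yc * yc / cc + ys * ys / ss)

def autocorrelation(y, max_lag):
    """Autocorrelation for lags 0..max_lag (in samples); assumes even sampling."""
    n = len(y)
    mean = sum(y) / n
    resid = [v - mean for v in y]
    denom = sum(r * r for r in resid)
    # Slide a copy of the data to the right by `lag` samples and measure
    # how well the overlapping parts match.
    return [sum(resid[i] * resid[i + lag] for i in range(n - lag)) / denom
            for lag in range(max_lag + 1)]

# Hypothetical light curve: 400 points, 0.1 days apart, with a 2.5-day signal.
dt = 0.1
true_period = 2.5
t = [i * dt for i in range(400)]
flux = [1.0 + 0.01 * math.sin(2 * math.pi * ti / true_period) for ti in t]

# Assumed search range: periods between 1 and 10 days.
min_period, max_period = 1.0, 10.0

# Lomb-Scargle: scan a frequency grid, then invert the best frequency.
f_min, f_max = 1.0 / max_period, 1.0 / min_period
n_freq = 451
freqs = [f_min + k * (f_max - f_min) / (n_freq - 1) for k in range(n_freq)]
powers = [lomb_scargle_power(t, flux, f) for f in freqs]
ls_period = 1.0 / freqs[powers.index(max(powers))]

# ACF: look for the strongest match among shifts inside the period range.
min_lag = round(min_period / dt)
max_lag = round(max_period / dt)
acf = autocorrelation(flux, max_lag)
acf_lag = max(range(min_lag, max_lag + 1), key=lambda k: acf[k])
acf_period = acf_lag * dt

print("Lomb-Scargle period (days):", ls_period)
print("ACF period (days):", acf_period)
```

Both methods should land on (or right next to) the 2.5-day period that was baked into the fake data. The first ACF value, at zero shift, is the "ultimate power" of the data sitting on top of itself. On real light curves with gaps or uneven sampling, this simple ACF would need the data interpolated onto an even grid first, and a library implementation of Lomb-Scargle would be much faster than these pure-Python loops.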
Now that I have these two functions, I just have one more to go before I can start finding patterns! Up next week: Box Least Squares, which once again finds powerful periods in a different way.
Have a good weekend everyone!