I published v0.2, which includes a bunch of hacked together proof of concept code. This week I dove into the pandas library, focusing on the data manipulation tools that are inherit to pandas. This post covers some of the highlights and demonstrates the initial charting that I’ve been producing.
More to come on what I plan to do with the data; for now those details have to be a trade secret.
The crux of pandas is the dataframe and this week I discoverd that you can load your query results right into a dataframe and not have to write a parser. Notice in the below code that I run the query and place the results into a dataframe with a ‘fetchall()’. This really saves me a lot of headaches when passing around chunks of data and makes it easy to write functions to combine data with overlapping values. Return the dataframe and do more ::work::.
Here’s a query that I use to get NPC kills by region when given a regionID.
def getnpckills_byregion(regionID): conn = psycopg2.connect(conn_string) cursor = conn.cursor() sql = """SELECT "mapSolarSystems"."regionID", "mapRegions"."regionName", SUM(mapkills."factionKills") as SUM_factionKills, mapkills."timestamp" FROM data.mapkills, public."mapRegions", public."mapSolarSystems" WHERE "mapRegions"."regionID" = "mapSolarSystems"."regionID" AND "mapSolarSystems"."solarSystemID" = mapkills."solarSystemID" AND public."mapSolarSystems"."regionID" = %s GROUP BY mapkills."timestamp", public."mapSolarSystems"."regionID", "mapRegions"."regionName" ORDER BY timestamp DESC """ data = (regionID, ) cursor.execute(sql, data) df = pd.DataFrame(cursor.fetchall(),columns=['regionID', 'regionName', 'SUM_factionKills', 'timestamp']) cursor.close() return df
Doing a pivot in pandas is amazingly elegant and you can perform the operation in one statement.
df.pivot(index='timestamp', columns='regionName', values='SUM_factionKills')
Another powerful tool in the library are the resampling options. I’m still trying adjusting the parameters to find a balance on data granularity.
The below graphs show NPC kills over all regions over time. One graph has no resampling while the other has values that I feel display a daily value, normalizing the peaks and valleys that occur due to server downtime and differing activity rates due to timezone.
conversion = "12H" v_window = 8 v_minperiods = 1 pd.rolling_mean(df.resample(conversion, fill_method='bfill'), window=v_window, min_periods=v_minperiods)
Ratting activity in Blood Raiders regions:
Ratting activity in Serpentis regions:
More to come. Find me on #k162space on tweetfleet.slack.com.
I published the initial code to GitHub with install directions for Ubuntu. The project so far is a Python script that consumes map API jump and kill activity and inserts them into a PostgreSQL database. Details as to why I picked these two items as starting points to come later.
So far this has been a fun learning project; I’m getting more familiar with Python datatypes/iterators/quirks, have been leveraging arrow for any datetime work, am getting more comfortable with pandas, and have a handle on the basic differences between MySQL and PostgreSQL.
Growing Laundry List
- Implement more efficient UPSERT to improve database insertion efficiency.
- Figure out what CREST market data to record.
- Start working on putting the pieces together for the first analysis — more on that later.
Presenting data with Python Pandas.
Manually extracting the data to a csv and charting it in D3.js for now.
The past few months have been a series of shifting goals and commitments for me, both personally and professionally. I’ve spent more time outside running trails in the Marin Headlands and I changed jobs, landing a engineering role at a trading firm.
The winter months in San Francisco mean rain, and as such, I have been spending a lot of time reading Python tutorials, collaborating with James and Greg at element43, and working on a list of design requirements for the spotmarket trading platform.
The biggest addition to the project has been the buildout of a proper homelab server running ESXi. I’ve wanted to build out a homelab since I’m a network engineer and often tinker, but I’ve never really had a driver for the project. For the odd virtual machine or instance of a virtual switch/router that I’ve needed, I’ve leveraged an install of Oracle VirtualBox on my desktop. Now that I want to run development, production, and other testing machines, I built out a server.
Server ‘esx-01’, Uninspired Name
I ended up running ESXi 5.5 because I struggled to get my ‘unsupported’ NIC to work in 6.0. I followed a few guides to load drivers into the installation ISO, but was unable to get the setup procedure to complete properly for 6.0. My needs are basic on the VMware end so I’m sticking with 5.5 as I was able to get that to install with Realtek 8168 onboard NIC.
Details for the build:
- ASUS M5A99FX PRO R2.0 Motherboard
- AMD FX-8320 Vishera 8-Core 3.5 GHz CPU
- SanDisk Ultra II 2.5″ 480GB SSD
- G.SKILL Ripjaws X Series 16GB (2 x 8GB)
I was in a computer engineering program for two years in college before switching over and finishing in a telecommunications program. During that time I was exposed to assembly, C, and C++. I spent many late nights struggling with those languages, so I was apprehensive to dive back into programming.
I picked Python as the main language for this project and so far it has been outstanding. I’m getting more comfortable with the Python datatypes, iterators, and reliance on indentation. For data manipulation I’m using pandas, which for anyone that loves rows and columned data, it is second nature.
Another item that has taken a lot of brain time has been the switch from MySQL to PostgreSQL. I’ve had to learn about the use of schemas, differences in data types, how to enable TCP access, and migrate my administration/query generator tool over to pgAdmin.
If you have any comments or want to chat with me, find me on the new #k192space channel at tweetfleet.slack.com. Upcoming posts will delve into coding and talk about the reporting that I’ve started to build.
If you have followed the musings of this blog over the past two or three years, you are probably familiar with our Wallet Manger project. Growing out of the limitations of comparing market price exports from the client in Excel, James and I started a homebrew project in a framework called Yii to help us manage our market trades and have more visibility on item performance.
It wasn’t polished, lacked security features, had many missing features, and was more of a proof of concept than a properly managed project; it was a path to flex our IT muscles and learn XML, JSON, PHP, MySQL, database Indexes, and version control.
2015 ended on a high note for me as I landed a position at a quantitative trading firm, with the challenges of coding the Wallet Manager and stories about working with Eve 3rd party developers broached during the interview process.
2016 will be a year of rekindled interest in working with Eve data again for both me and my partner James. Want to refresh our homebrew project using an updated set of tools and change the focus.
There are many projects out there that record historical data from the API and present it to you. We want to go a step further and build an analysis platform that can incorporate data from the API and use it to create relationships to fuel predictions. We’re currently picking our tools and formulating requirements so be sure to watch for updates as we develop our platform.
If you want to join the discussion, head on over to #k162space on tweetfleet.slack.com.
After months of waiting and watching, I finally cashed out on the pile of Gecko heavy drones.
I bought 326 at an average price of 50.05 M ISK. I cashed out after 219 days for a net profit of 8.83 B. The reason that I exited the market on this item was because of how the daily profit per days invested chart was trending. I missed the bubble on the market and had to settle for a average of 40.51 M ISK/day invested. The time to sell would have been before Eve Vegas.
Producing capitals still averaged out to more money on average at 50.27 M ISK/day; details on that project can be found here.
During the Capital Ship session at Eve Vegas we learned that Capital ships will no longer be able to use standard drones. This announcement sent the Gecko market into a period of high volatility given the speculative rarity of the item. Looking at the charts, it seems that the volume and return amounts are trending towards less volatility and a price around 90M.
The blue bars represent numbers that are based on finalized price data while the yellow bars represent an estimated value based on current trends.
If you have a stockpile, I would not sell at the moment. Hold on to that investment.
If you have never attended a player gathering, it is an opportunity to witness the sheer passion, altruistic spirit, and friendless of the Eve community. Along with the playerbase, many of your favorite Devs are walking around, giving you the opportunity to have a face-to-face interaction with them. I had the oppournity to personally thank CCP Seagull for her passion and dedication on delivering a renewed vision of Eve Online. Many thanks to my partner in crime for the weekend, Lockefox (@HLIBIndustry).
Witnessing Vegas first hand.
Abusing the Planet Hollywood restaurant twitter hashtag.
Front row seats for new Capital reveal.
Watching some PvP action and bidding on server blades.
Seeing the Floating Nyx model and crowds.
Cheering for Structures with CCP Nullarbor.
Sharing a drink with @CCPQuant, talking statistics, and learning Icelandic for modern inventions like ‘refrigerator’ and ‘computer’.
Witnessing the power of an open bar.
Being a guest on