Duo Tech Talk: OSXCollector - Automated Forensic Evidence Collection & Analysis for OS X
Duo Tech Talks is a monthly series of computer science and technology presentations hosted at Duo Security’s downtown Ann Arbor headquarters. Featuring experts from the local community and across the country, the talks cover a variety of topics including software engineering, hardware hacking, user experience design, cloud computing, programming, computer security and more. Sign up for future tech talks on the Duo Meetup page!
February 2015’s Duo Tech Talk featured Ivan Leichtling of Yelp, the company behind the website and mobile app that publishes crowd-sourced reviews about local businesses. He leads the security team focused on securing Yelp’s visitors, mobile apps, websites, employees and infrastructure. Prior to Yelp, Ivan led teams, built hardware and wrote software at Microsoft.
Ivan led a talk at Duo last Friday on OSXCollector, an open source forensic evidence collection and analysis toolkit for OS X developed in-house at Yelp. If you missed it, here’s a recording:
OSXCollector automates the digital forensic evidence collection and analysis that Yelp’s team of responders had previously been doing manually.
A typical corporate network has many access points, which creates plenty of opportunity for compromise. The warning signs include suspicious DNS requests, callouts, alerts and malware.
Yelp built OSXCollector in response to the large volume of Mac security alerts that its team receives, and the need to combat OS X malware. Yelp spans 29 countries, with 73 million mobile users and 139 million monthly users, which exposes it to a lot of potential threats.
Alerts might come from host-based detectors that report on known malware infestations or new startup items, or from network-based detectors that see potential command-and-control (C2) callouts or DNS requests to resolve suspicious domains. Sometimes they’re user-reported.
After the Yelp team receives an alert, their incident response team’s first goal is to figure out how to contain and then eradicate the threat. Then they move on to figuring out exactly what happened and how to prevent it in the future. One of the primary tools they use for root-causing OS X alerts is OSXCollector.
###Forensic Collection
Ivan detailed how OSXCollector is built and how it runs - it’s a single Python file with zero dependencies to install on OS X. It runs without any support libraries that aren’t already available on the system. The output is JSON, which is easy to manipulate.
OS X stores a lot of application data in SQLite databases. It’s really easy to get the contents of an SQLite database - just dump the tables and column names, then build a dictionary of column-name/value pairs for every row. The result shows what the operating system and its applications have been doing.
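The dump described above can be sketched in a few lines of Python with the standard `sqlite3` module. This is a minimal illustration, not OSXCollector's actual code; the function name `dump_sqlite` and the `_table` key are assumptions made for the example.

```python
import sqlite3


def dump_sqlite(path):
    """Yield one dict per row, for every table in the database at `path`.

    Each dict maps column names to values, plus a hypothetical "_table"
    key recording which table the row came from.
    """
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # rows expose their column names
    try:
        tables = [
            row["name"]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        ]
        for table in tables:
            # Quote the table name so unusual names do not break the query.
            quoted = '"{}"'.format(table.replace('"', '""'))
            for row in conn.execute("SELECT * FROM {}".format(quoted)):
                record = {key: row[key] for key in row.keys()}
                record["_table"] = table
                yield record
    finally:
        conn.close()
```

Each yielded dictionary can then be serialized as a line of JSON, which matches the JSON output the talk describes.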
Plists, or property list files, are also used throughout OS X, often to store a user’s settings and configurations, including information about bundles and apps. OSXCollector reads plists from Python using Foundation, the OS X framework, through its Python bindings - a nice wrapper over the Objective-C API.
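For readers who want to experiment without the Foundation bindings, a plist can also be read with Python's standard `plistlib` module. Note that this swaps in a different technique from the one the talk describes; the helper names `read_plist` and `bundle_summary` are hypothetical, while `CFBundleIdentifier` and `CFBundleShortVersionString` are standard `Info.plist` keys.

```python
import plistlib


def read_plist(path):
    """Parse an XML or binary plist file into plain Python objects."""
    with open(path, "rb") as handle:
        return plistlib.load(handle)


def bundle_summary(info_plist_path):
    """Pull a few identifying fields out of an app bundle's Info.plist."""
    info = read_plist(info_plist_path)
    return {
        "path": info_plist_path,
        "bundle_id": info.get("CFBundleIdentifier"),
        "version": info.get("CFBundleShortVersionString"),
    }
```

Calling `bundle_summary` on an application's `Contents/Info.plist` returns a small dictionary that is ready to be emitted as JSON.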
OSXCollector collects a number of items, including kernel extensions, downloads, installed applications, system info, browser info, quarantines, email info, startup items, groups and accounts. Downloads and other files have common keys - path, hashes, timestamps, signature chains, plist info and more.
Timestamps are important to digital forensics, but they’re often stored in five or six different ways. OSXCollector normalizes the timestamps into a single format, so you can figure out what may have happened five seconds before a certain timestamp, which can help when investigating a security alert.
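To make the normalization idea concrete, here is a small sketch that converts values counted from three common epochs into one string format: Unix time (1970), Mac absolute time (2001) and WebKit time (microseconds since 1601). The function names, the `EPOCHS` table and the output format are assumptions for illustration, not OSXCollector's actual implementation.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%d %H:%M:%S"

# Reference points that different applications count from.
EPOCHS = {
    "unix": datetime(1970, 1, 1, tzinfo=timezone.utc),
    "mac_absolute": datetime(2001, 1, 1, tzinfo=timezone.utc),
    "webkit": datetime(1601, 1, 1, tzinfo=timezone.utc),
}


def normalize(value, epoch, unit="seconds"):
    """Convert a raw count since `epoch` into a UTC timestamp string.

    `unit` is "seconds" or "microseconds", matching timedelta keywords.
    """
    return (EPOCHS[epoch] + timedelta(**{unit: value})).strftime(FMT)


def window_before(stamp, seconds=5):
    """Return (start, end) strings covering `seconds` before `stamp`."""
    end = datetime.strptime(stamp, FMT)
    start = end - timedelta(seconds=seconds)
    return start.strftime(FMT), end.strftime(FMT)
```

Once every record carries the same timestamp format, finding activity in the five seconds before an alert becomes a simple string-range comparison against the pair returned by `window_before`.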
Ivan also covered quarantines, hashes, and signature chains.
###Forensic Analysis
Ivan covered the different aspects of forensic analysis, starting with manual analysis using grep and jq - you can grep for a time window to see user activity, including downloads, or narrow the results to only the URLs in a time window or to a single user.
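The talk used grep and jq for this step; the same kind of time-window query can be sketched in Python. The sketch assumes the output holds one JSON object per line with normalized `YYYY-MM-DD HH:MM:SS` timestamps, and the key names `timestamp` and `user` are hypothetical placeholders rather than OSXCollector's real field names.

```python
import json


def select(lines, start, end, user=None, time_key="timestamp"):
    """Yield records whose timestamp falls within [start, end].

    Timestamps are compared as strings, which works because the assumed
    format sorts chronologically. If `user` is given, only records for
    that user are kept.
    """
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        stamp = record.get(time_key)
        if stamp is None or not (start <= stamp <= end):
            continue
        if user is not None and record.get("user") != user:
            continue
        yield record
```

For example, passing an open output file as `lines` together with a five-second window and a user name returns only that user's records from just before the alert.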
OSXCollector has a full framework for automated analysis with output filters. That means all data flows into a filter that may alter the data, then passes it along to the next filter.
One example is the domains filter, which finds every domain that appears in a line of output and adds those domains to that line under a new key. A blacklist filter can then match against any key, for example to determine whether a visited domain is blacklisted for malware.
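The chaining idea can be illustrated with Python generators, where each filter consumes records and yields them, possibly altered, to the next. This is a rough sketch under stated assumptions, not OSXCollector's filter API: the function names and the `domains` and `blacklist_hits` keys are invented for the example, and domains are found with a simple URL regular expression.

```python
import json
import re
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def find_domains(value):
    """Recursively collect hostnames from URL-looking strings."""
    found = set()
    if isinstance(value, str):
        for url in URL_RE.findall(value):
            try:
                host = urlparse(url).hostname
            except ValueError:
                continue  # malformed URL, ignore it
            if host:
                found.add(host)
    elif isinstance(value, dict):
        for item in value.values():
            found |= find_domains(item)
    elif isinstance(value, list):
        for item in value:
            found |= find_domains(item)
    return found


def domains_filter(records):
    """Add a 'domains' key to every record that mentions a domain."""
    for record in records:
        domains = sorted(find_domains(record))
        yield dict(record, domains=domains) if domains else record


def blacklist_filter(records, blacklist):
    """Flag records whose domains appear in the blacklist set."""
    for record in records:
        hits = sorted(set(record.get("domains", [])) & blacklist)
        yield dict(record, blacklist_hits=hits) if hits else record


def run_chain(lines, blacklist):
    """Parse JSON lines, run both filters in order, and re-serialize."""
    records = (json.loads(line) for line in lines if line.strip())
    for record in blacklist_filter(domains_filter(records), blacklist):
        yield json.dumps(record, sort_keys=True)
```

Because each stage only reads and writes records, further filters can be appended to the chain without changing the earlier ones.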
Ivan also discussed other filters, such as the OpenDNS related domains filter, the OpenDNS domain reputation filter and the VirusTotal hash lookup filter.
Ivan Leichtling, Yelp
Ivan Leichtling leads an amazing team of engineers focused on securing Yelp's visitors, mobile apps, websites, employees, and infrastructure. Ivan holds a BS in Computer Science from the Columbia University School of Engineering and Applied Sciences. Prior to Yelp, Ivan spent a dozen years writing software, building hardware, and leading teams at Microsoft. Ivan is an anagram of vain and as such appreciates Twitter followers at @c0wl and @YelpEngineering.
###The Best of Bug Finding
If you’re in the Ann Arbor area, sign up or just show up to the next Duo Tech Talk, slated for March 6, 2015 at 6pm EST. Dr. Charles Miller of the Application Security Team at Twitter will be presenting on different security vulnerabilities and some of his favorite bugs/exploits of his career. Check out The Best of Bug Finding.