[DAISY] Documentation (v0.5)

Working with DAISY Demo: Evidence, Notebooks and RAM Configurations

We recently released DAISY, the new Data Science and Artificial Intelligence for DFIR Virtual Machine, which includes precooked outputs from different tools and Jupyter Notebooks to apply Data Science (DS) to them. This documentation page explains what is included and tries to answer the main questions you may have when working with DAISY: How much RAM do I need to work with DAISY? What is the minimum required to start playing and learning the basics?

If you only want to know what you can run with different RAM configurations, take a look at the "Running analysis in demo evidence" section.

How is DAISY configured?

First of all, note that there are two different versions of DAISY in the downloads section. The demo version brings some precooked evidence and notebooks and is perfect for learning and playing, while the production version is completely clean and a better option for your production environment. This page covers the demo version of DAISY, which contains outputs from the following evidence:

One disk of the following cases:

This evidence has been used to produce a precooked output/notebook file for each of the following tools:

plaso (it has a second output/notebook for evtx)
Volatility
KAPE
Autoruns
fls/mactime

All these precooked outputs can be found under /mnt/data/Precooked/. Here you will find a folder for each case, containing different outputs ready to be used by the precreated Jupyter Notebooks and other tools:

Ah2polivio (Ali Hadi case): contains four files used for the fls demo Jupyter notebook. This notebook creates one dataframe per file
Musctf19:
- mus-ctf-19-desktop-001.plaso.json: full plaso output that cannot be loaded with 8GB of RAM
- mus-ctf-19-desktop-001_reduced.plaso.json: reduced version of the previous file, created to be used with 8GB of RAM
- mus-ctf-19-desktop-001.timeline.csv: file used for the fstl notebook
Szechuan:
- vol*: files from volatility used for the volatility notebook
- kape: folder containing outputs for the kape notebook
- szechuan_desktop_plaso_evtx_log2timeline.json: file containing only evtx extracted with plaso for the plaso-evtx notebook
- szechuan_dc01_plaso_log2timeline_reduced.csv: file created to be used with TimeSketch and 8GB of RAM
- autorunsc-citadel-dc01.csv: file used for the autoruns notebook
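
To see which of these files are present and how large they are before opening a notebook, a short standard-library script is enough. This is a minimal sketch, not part of DAISY or the ds4n6_lib: the helper name list_outputs is hypothetical, and the only path taken from this page is /mnt/data/Precooked/.

```python
from pathlib import Path


def list_outputs(root):
    """Return (relative path, size in MB) for every file under root, sorted by path."""
    root = Path(root)
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            size_mb = path.stat().st_size / (1024 * 1024)
            rows.append((str(path.relative_to(root)), round(size_mb, 1)))
    return rows


if __name__ == "__main__":
    precooked = Path("/mnt/data/Precooked")  # location given on this page
    if precooked.is_dir():
        for name, size_mb in list_outputs(precooked):
            print(f"{size_mb:10.1f} MB  {name}")
    else:
        print(f"{precooked} not found; is this the demo version of DAISY?")
```

Comparing the reported sizes with the RAM assigned to the VM gives a quick idea of which outputs are worth loading first.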

The precooked notebooks can be found under /opt/ds4n6/anaconda3/Notebooks/. Here you will find two folders:

Demo: these notebooks are ready to be run with the precooked evidence in the Demo DAISY. All of them are prepared to work with 8GB of RAM, so you only have to run all the cells and start playing with the outputs. They are perfect to learn about DS and the ds4n6_lib
Templates: here you will find the same notebooks you have in the Demo folder, but they don't contain any demo information and they are ready to load your own data in the easiest way. You can find this folder in both the Demo and Production versions of DAISY, and the notebooks are perfect for your production environment and investigations

In both versions (production and demo), DAISY is preconfigured with 8GB of RAM. This is not an arbitrary decision: it is the minimum recommended to load all the precooked evidence and play with the Data Science, and it is also the minimum required by other tools such as TimeSketch. However, 8GB of RAM is not enough to load the outputs of some tools, such as plaso when all the parsers are run on a whole piece of evidence. For these cases, DAISY includes a reduced version of the output files to be used with 8GB. Likewise, all the demo notebooks are ready to load the reduced evidence where necessary.

Does this mean we cannot use DAISY if we have less than 8GB of RAM available for the VM? Absolutely not; it only means you won't be able to run all the tools and prepared notebooks. Let's take a look at the demo evidence.

Running analysis in demo evidence

Precooked notebooks

The good news is that almost all of the precooked outputs can be used with only 4GB of RAM, so this is actually the minimum recommended to learn Data Science and play with DAISY. Remember that if you don't shut down the kernels of the notebooks you run, the available RAM will decrease and you won't be able to run all the notebooks, so it is good practice to shut down or restart the kernel when you stop working with a notebook.
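
One way to check how much RAM is still free before opening another notebook is to read the kernel's memory report. The sketch below assumes the VM is a Linux guest that exposes /proc/meminfo with a MemAvailable line; the function names are hypothetical and not part of DAISY.

```python
def available_gib(meminfo_text):
    """Parse the MemAvailable line of /proc/meminfo (reported in kB) and return GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])
            return kib / (1024 * 1024)
    raise ValueError("MemAvailable line not found")


def current_available_gib(path="/proc/meminfo"):
    """Read the memory report from disk and return the available RAM in GiB."""
    with open(path) as handle:
        return available_gib(handle.read())


if __name__ == "__main__":
    try:
        print(f"Available RAM: {current_available_gib():.1f} GiB")
    except (OSError, ValueError) as err:
        print(f"Could not read memory info: {err}")
```

If the reported value drops well below what the VM was given, idle kernels from earlier notebooks are a likely cause and shutting them down should free the memory.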

The table below shows which notebooks can process the demo evidence with 4GB of RAM. Each notebook takes about one minute to process on a laptop with the DAISY standard configuration:

Notebook Template	Works with 4GB RAM	Evidence Case	Tool	Output Size
plaso-evtx	X	Szechuan	plaso	16MB
volatility	X	Szechuan	volatility	101MB
kape	X	Szechuan	kape	35MB
autoruns	X	Szechuan	autoruns	1.1MB
fls	X	Ah2-polivio	fls	104MB
mactime	X	Musctf19	mactime	184MB
plaso (reduced output)		Musctf19	plaso	489MB
plaso		Musctf19	plaso	2.6GB

So, what about plaso? As the table above shows, the full plaso output is 2.6GB (remember that it comes from running all the parsers on the whole disk), so we have created a 489MB reduced version that can be used with 8GB of RAM. If you want to use the full plaso file, you will need 14GB of RAM in your VM.
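
The choice between the two plaso files can be expressed as a small lookup. This is only an illustrative sketch: the thresholds (14GB and 8GB) and file names come from this page, while the function name pick_plaso_file and the assumption that both files sit in /mnt/data/Precooked/Musctf19/ are mine.

```python
# Assumed location: the per-case folder under /mnt/data/Precooked/ described on this page.
MUSCTF19_DIR = "/mnt/data/Precooked/Musctf19"

# (minimum VM RAM in GB, file name), largest requirement first.
PLASO_FILES = [
    (14, "mus-ctf-19-desktop-001.plaso.json"),         # full output, 2.6GB
    (8, "mus-ctf-19-desktop-001_reduced.plaso.json"),  # reduced output, 489MB
]


def pick_plaso_file(vm_ram_gb):
    """Return the path of the largest plaso output the VM can load, or None."""
    for min_ram_gb, name in PLASO_FILES:
        if vm_ram_gb >= min_ram_gb:
            return f"{MUSCTF19_DIR}/{name}"
    return None


if __name__ == "__main__":
    for ram in (4, 8, 16):
        print(f"{ram}GB -> {pick_plaso_file(ram)}")
```

With 4GB the function returns None, which matches the table: neither plaso output is marked as working at that size.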

As a final tip for the notebooks, with 8GB of RAM you can run all the notebooks, except for the full plaso one, without any kernel restart.

TimeSketch

We also have a precooked file for TimeSketch, so you can create your demo investigation timeline to try TimeSketch and Picatrix functions. The file can be found in the Demo version in: /mnt/data/Precooked/Szechuan/szechuan_dc01_plaso_log2timeline_reduced.csv

As with the notebooks, 8GB of RAM is not enough to load the CSV file produced by running all the Plaso parsers on the whole disk, so we have created a reduced version of the Plaso output. With this file and 8GB of RAM, you can create the investigation, run analyses and use all the TimeSketch features.
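
Before importing a CSV into TimeSketch, it can help to look at its header and row count without loading the whole file into memory. The following is a minimal sketch using only the Python standard library; summarize_csv is a hypothetical helper, and the file's columns and encoding are not documented here, so the code makes no assumption about them beyond reading the file as UTF-8 text.

```python
import csv


def summarize_csv(path):
    """Return (header, data_row_count), reading one row at a time."""
    with open(path, newline="", encoding="utf-8", errors="replace") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])          # first line holds the column names
        row_count = sum(1 for _ in reader)  # stream the rest without storing it
    return header, row_count


if __name__ == "__main__":
    demo_csv = "/mnt/data/Precooked/Szechuan/szechuan_dc01_plaso_log2timeline_reduced.csv"
    try:
        header, rows = summarize_csv(demo_csv)
        print(f"{rows} rows, columns: {header}")
    except OSError as err:
        print(f"Could not open {demo_csv}: {err}")
```

Because rows are consumed one at a time, memory use stays roughly constant regardless of file size, so the same check can be run on a full, unreduced timeline.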