Welcome to part 4 of the TensorFlow Object Detection API tutorial series. In this part of the tutorial, we're going to cover how to create the TFRecord files that we need to train an object detection model.
At this point, you should have an images directory that contains all of your images, along with 2 more directories: train and test. The test directory should hold a copy of ~10% of your images with their XML annotation data, and the train directory should hold a copy of the rest. If you do not have this, go to the previous tutorial.
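If you would rather script that split than copy files by hand, here is a minimal sketch. It assumes every .jpg in the images directory has its .xml annotation sitting right next to it with the same base name; the split_dataset name, the seed, and the .jpg-only glob are my own choices for illustration, not something from the previous tutorial.

```python
import random
import shutil
from pathlib import Path


def split_dataset(images_dir, test_fraction=0.1, seed=0):
    """Copy each image and its XML annotation into images_dir/train or images_dir/test."""
    images_dir = Path(images_dir)
    # Only images that have a matching .xml annotation are considered (assumption).
    images = sorted(p for p in images_dir.glob("*.jpg") if p.with_suffix(".xml").exists())
    random.Random(seed).shuffle(images)

    # Keep at least one test image whenever there is anything to split.
    n_test = max(1, round(len(images) * test_fraction)) if images else 0
    splits = {"test": images[:n_test], "train": images[n_test:]}

    for name, files in splits.items():
        target = images_dir / name
        target.mkdir(exist_ok=True)
        for image in files:
            shutil.copy(image, target / image.name)
            shutil.copy(image.with_suffix(".xml"), target / (image.stem + ".xml"))
    return {name: len(files) for name, files in splits.items()}
```

Calling split_dataset('images') from the Object-Detection directory would leave the originals in place and fill images/train and images/test with copies.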
Now we need to convert these XML files into single CSV files (one for train, one for test) that can then be converted to the TFRecord files. To do this, I am going to make use of some of the code from datitran's github, with some minor changes. To begin, we're going to use xml_to_csv.py. You can either clone his entire repository or just grab the files; we'll be using two of them. Since his repository has had multiple breaking changes since I've been messing with it, I will note that the exact commit that I've been playing with is: here. If either of these two scripts isn't working for you, try pulling from the same commit as me. Definitely try his latest versions though. For example, at the time of my writing this, he has just updated for multiple box labels in images, which is obviously a very useful improvement.
Within the xml_to_csv script, I changed:
def main():
    image_path = os.path.join(os.getcwd(), 'annotations')
    xml_df = xml_to_csv(image_path)
    xml_df.to_csv('raccoon_labels.csv', index=None)
    print('Successfully converted xml to csv.')
To:
def main():
    for directory in ['train','test']:
        image_path = os.path.join(os.getcwd(), 'images/{}'.format(directory))
        xml_df = xml_to_csv(image_path)
        xml_df.to_csv('data/{}_labels.csv'.format(directory), index=None)
        print('Successfully converted xml to csv.')
This just handles the train/test split and names the files something useful. Go ahead and make a data directory, and run this to create the two files. Next, create a training directory from within the main Object-Detection dir. At this point, you should have the following structure (mine is on my Desktop):
Object-Detection
-data/
--test_labels.csv
--train_labels.csv
-images/
--test/
---testingimages.jpg
--train/
---testingimages.jpg
--...yourimages.jpg
-training
-xml_to_csv.py
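If you are curious what the XML-to-CSV conversion boils down to, here is a rough standard-library sketch. This is not datitran's code: it assumes LabelImg-style Pascal VOC XML (filename, size, and one object/bndbox per labeled box), and the column names and function names are my own guess at a sensible layout, so check them against the real script before relying on them.

```python
import csv
import glob
import os
import xml.etree.ElementTree as ET

# Assumed column layout; verify against the CSV that xml_to_csv.py actually writes.
COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def xml_dir_to_rows(xml_dir):
    """Return one row per labeled box found in the XML files of xml_dir."""
    rows = []
    for xml_file in sorted(glob.glob(os.path.join(xml_dir, "*.xml"))):
        root = ET.parse(xml_file).getroot()
        size = root.find("size")
        for obj in root.findall("object"):
            box = obj.find("bndbox")
            rows.append([
                root.find("filename").text,
                int(size.find("width").text),
                int(size.find("height").text),
                obj.find("name").text,
                int(box.find("xmin").text),
                int(box.find("ymin").text),
                int(box.find("xmax").text),
                int(box.find("ymax").text),
            ])
    return rows


def write_csv(rows, csv_path):
    """Write the header line followed by one line per box."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        writer.writerows(rows)
```

An image with three labeled boxes produces three rows that share the same filename, which is why the CSV can end up longer than your image count.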
Now, grab generate_tfrecord.py. The only modification that you will need to make here is in the class_text_to_int function. You need to change this to your specific class. In our case, we just have ONE class. If you had many classes, then you would need to keep building out this if statement.
# TO-DO replace this with label map
def class_text_to_int(row_label):
    if row_label == 'macncheese':
        return 1
    else:
        return None
Judging by that to-do, this function may change quite a bit in the future, so, again, use your intuition to modify the latest version, or go to the same commit that I am using.
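If you do have many classes, a dictionary lookup is one way to avoid a long chain of if statements. This is just a sketch of the idea, not what the script ships with: the LABEL_MAP name is mine, and every class other than 'macncheese' is made up for illustration. IDs start at 1 here, matching the single-class example above.

```python
# Hypothetical label map: only 'macncheese' comes from this tutorial,
# the other class names are placeholders.
LABEL_MAP = {
    "macncheese": 1,
    "pizza": 2,
    "salad": 3,
}


def class_text_to_int(row_label):
    """Return the integer ID for a class name, or None if the name is unknown."""
    return LABEL_MAP.get(row_label)
```

Whatever IDs you pick here need to line up with the label map you use for training later on.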
Next, in order to use this, we need to either be running from within the models directory of the cloned models github, or we can more formally install the object detection API.
I am doing this tutorial on a fresh machine to be certain I don't miss any steps, so I will be fully setting up the Object Detection API. If you've already cloned and set it up, feel free to skip the initial steps and pick back up on the setup.py part!
First, I am cloning the repository to my desktop:
git clone https://github.com/tensorflow/models.git
Then, following the installation instructions:
sudo apt-get install protobuf-compiler python-pil python-lxml
sudo pip install jupyter
sudo pip install matplotlib
And then:
# From tensorflow/models/
protoc object_detection/protos/*.proto --python_out=.
If you get an error on the protoc command on Ubuntu, check the version you are running with protoc --version. If it's not the latest version, you might want to update. As of my writing of this, we're using 3.4.0. In order to update or get protoc, head to the protoc releases page. Download the python version, extract it, navigate into the directory, and then do:
sudo ./configure
sudo make check
sudo make install
After that, try the protoc command again (again, make sure you are issuing this from the models dir).
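If you want to check the protoc version from a script rather than by eye, here is a small sketch. It assumes protoc --version prints something like "libprotoc 3.4.0"; the helper names are my own and are not part of protobuf.

```python
import re
import subprocess


def parse_protoc_version(output):
    """Turn text like 'libprotoc 3.4.0' into a tuple such as (3, 4, 0)."""
    match = re.search(r"(\d+(?:\.\d+)+)", output)
    if match is None:
        raise ValueError("no version number found in: {!r}".format(output))
    return tuple(int(part) for part in match.group(1).split("."))


def installed_protoc_version():
    """Run `protoc --version`; return the parsed version, or None if protoc is missing."""
    try:
        result = subprocess.run(["protoc", "--version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    return parse_protoc_version(result.stdout)
```

Tuples compare element by element, so something like installed_protoc_version() >= (3, 4, 0) works as a quick check once you know protoc is installed.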
Then add the models and slim directories to your PYTHONPATH:
# From tensorflow/models/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
Finally, let's install the object_detection library formally by doing the following from within the models directory:
sudo python3 setup.py install
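To confirm that Python can now find the library, a quick check like the following may help. It is only a sketch: is_importable is a helper name I made up, and it just asks Python whether a top-level package can be located, without importing TensorFlow or running anything from the API.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate the named top-level module or package."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Run this with the same interpreter (python3) that you will use for
    # generate_tfrecord.py.
    print("object_detection importable:", is_importable("object_detection"))
```

If this prints False, double check the PYTHONPATH export and the setup.py step, and make sure you ran them for python3 rather than a different interpreter.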
Now we can run the generate_tfrecord.py script. We will run it twice, once for the train TFRecord and once for the test TFRecord.
python3 generate_tfrecord.py --csv_input=data/train_labels.csv --output_path=data/train.record
python3 generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=data/test.record
Update: As of Jan 12 2019, one of my viewers pointed out the above commands now require an additional flag: --image_dir. So, instead, you should do:
python3 generate_tfrecord.py --csv_input=data/train_labels.csv --output_path=data/train.record --image_dir=images/
python3 generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=data/test.record --image_dir=images/
Now, in your data directory, you should have train.record and test.record.
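If the script complains about missing files, one thing worth checking is whether every filename listed in the CSV actually exists in the directory you pass as --image_dir. Here is a minimal sketch of that check. It assumes the CSV has a filename column and that the script looks for each file directly inside the image directory; missing_images is my own helper name, not part of generate_tfrecord.py.

```python
import csv
import os


def missing_images(csv_path, image_dir):
    """Return the sorted filenames listed in csv_path that are absent from image_dir."""
    with open(csv_path, newline="") as f:
        # Assumes a header row that includes a 'filename' column.
        filenames = {row["filename"] for row in csv.DictReader(f)}
    return sorted(name for name in filenames
                  if not os.path.isfile(os.path.join(image_dir, name)))
```

For example, missing_images('data/train_labels.csv', 'images/') should come back as an empty list when everything lines up.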
Next up, we need to set up a configuration file and then either train a new model or start from a checkpoint with a pre-trained model, which is what we'll be covering in the next tutorial.