6 years ago · 38350a57f6
--- a/docs/autotune.md
+++ b/docs/autotune.md
@@ -13,9 +13,18 @@ In order to activate hyperparameter optimization, we must provide a validation f
 
				 
			
 
				 For example, using the same data as our [tutorial example](/docs/en/supervised-tutorial.html#our-first-classifier), the autotune can be used in the following way:
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```sh
			
 
				 >> ./fasttext supervised -input cooking.train -output model_cooking -autotune-validation cooking.valid
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> import fasttext
			
 
				+>>> model = fasttext.train_supervised(input='cooking.train', autotuneValidationFile='cooking.valid')
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				+
			
 
				 
			
 
				 Then, fastText will search for the hyperparameters that give the best f1-score on the `cooking.valid` file:
			
 
				 ```sh
			
@@ -23,18 +32,36 @@ Progress: 100.0% Trials:   27 Best score:  0.406763 ETA:   0h 0m 0s
 
				 ```
			
 
				 
			
 
				 Now we can test the obtained model with:
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```sh
			
 
				->> ./fasttext test model_cooking.bin data/cooking.valid
			
 
				+>> ./fasttext test model_cooking.bin cooking.valid
			
 
				 N       3000
			
 
				 P@1     0.666
			
 
				 R@1     0.288
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> model.test("cooking.valid")
			
 
				+(3000L, 0.666, 0.288)
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				+
			
 
				 
			
 
				 By default, the search will take 5 minutes. You can set the timeout in seconds with the `-autotune-duration` argument. For example, if you want to set the limit to 10 minutes:
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```sh
			
 
				 >> ./fasttext supervised -input cooking.train -output model_cooking -autotune-validation cooking.valid -autotune-duration 600
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> import fasttext
			
 
				+>>> model = fasttext.train_supervised(input='cooking.train', autotuneValidationFile='cooking.valid', autotuneDuration=600)
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				+
			
 
				 
			
 
				 While autotuning, fastText displays the best f1-score found so far. If we decide to stop the tuning before the time limit, we can send one `SIGINT` signal (via `CTRL-C` for example). FastText will then finish the current training, and retrain with the best parameters found so far.
			
 
				 
			
@@ -46,23 +73,42 @@ As you may know, fastText can compress the model with [quantization](/docs/en/ch
 
				 
			
 
				 Fortunately, autotune can also find the hyperparameters for this compression task while targeting the desired model size. To this end, we can set the `-autotune-modelsize` argument:
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```sh
			
 
				 >> ./fasttext supervised -input cooking.train -output model_cooking -autotune-validation cooking.valid -autotune-modelsize 2M
			
 
				 ```
			
 
				-
			
 
				 This will produce a `.ftz` file with the best accuracy having the desired size:
			
 
				 ```sh
			
 
				 >> ls -la model_cooking.ftz
			
 
				 -rw-r--r--. 1 celebio users 1990862 Aug 25 05:39 model_cooking.ftz
			
 
				->> ./fasttext test model_cooking.ftz data/cooking.valid
			
 
				+>> ./fasttext test model_cooking.ftz cooking.valid
			
 
				 N       3000
			
 
				 P@1     0.57
			
 
				 R@1     0.246
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> import fasttext
			
 
				+>>> model = fasttext.train_supervised(input='cooking.train', autotuneValidationFile='cooking.valid', autotuneModelSize="2M")
			
 
				+```
			
 
				+If you save the model, you will obtain a model file with the desired size:
			
 
				+```py
			
 
				+>>> model.save_model("model_cooking.ftz")
			
 
				+>>> import os
			
 
				+>>> os.stat("model_cooking.ftz").st_size
			
 
				+1990862
			
 
				+>>> model.test("cooking.valid")
			
 
				+(3000L, 0.57, 0.246)
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				 
			
 
				 
			
 
				 # How to set the optimization metric?
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				+<br />
			
 
				 By default, autotune will test the validation file you provide, exactly the same way as `./fasttext test model_cooking.bin cooking.valid` and try to optimize to get the highest [f1-score](https://en.wikipedia.org/wiki/F1_score).
			
 
				 
			
 
				 But, if we want to optimize the score of a specific label, say `__label__baking`, we can set the `-autotune-metric` argument:
			
@@ -74,3 +120,19 @@ But, if we want to optimize the score of a specific label, say `__label__baking`
 
				 This is equivalent to manually optimizing the f1-score we get when we test with `./fasttext test-label model_cooking.bin cooking.valid | grep __label__baking` on the command line.
			
 
				 
			
 
				 Sometimes, you may be interested in predicting more than one label. For example, if you were optimizing the hyperparameters manually to get the best score to predict two labels, you would test with `./fasttext test model_cooking.bin cooking.valid 2`. You can also tell autotune to optimize the parameters by testing two labels with the `-autotune-predictions` argument.
			
 
				+<!--Python-->
			
 
				+<br />
			
 
				+By default, autotune will test the validation file you provide, exactly the same way as `model.test("cooking.valid")` and try to optimize to get the highest [f1-score](https://en.wikipedia.org/wiki/F1_score).
			
 
				+
			
 
				+But, if we want to optimize the score of a specific label, say `__label__baking`, we can set the `autotuneMetric` argument:
			
 
				+
			
 
				+```py
			
 
				+>>> import fasttext
			
 
				+>>> model = fasttext.train_supervised(input='cooking.train', autotuneValidationFile='cooking.valid', autotuneMetric="f1:__label__baking")
			
 
				+```
			
 
				+
			
 
				+This is equivalent to manually optimizing the f1-score we get when we test with `model.test_label('cooking.valid')['__label__baking']`.
			
 
				+
			
 
				+Sometimes, you may be interested in predicting more than one label. For example, if you were optimizing the hyperparameters manually to get the best score to predict two labels, you would test with `model.test("cooking.valid", k=2)`. You can also tell autotune to optimize the parameters by testing two labels with the `autotunePredictions` argument.
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				+
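For reference, the f1-score mentioned above is the harmonic mean of precision and recall. The snippet below is a minimal sketch assuming that standard definition; the helper name `f1_score` is hypothetical and is not part of the fastText API.

```py
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid a division by zero when both values are zero
    return 2 * precision * recall / (precision + recall)

# Precision and recall taken from the `test` output on cooking.valid above.
score = f1_score(0.666, 0.288)
print(round(score, 3))
```

A single score like this is convenient for a search procedure because it penalizes models that trade one of the two metrics entirely for the other.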
			
--- a/docs/supervised-tutorial.md
+++ b/docs/supervised-tutorial.md
@@ -26,9 +26,15 @@ Move to the fastText directory and build it:
 
				 
			
 
				 ```bash
			
 
				 $ cd fastText-0.9.1
			
 
				+# for command line tool :
			
 
				 $ make
			
 
				+# for python bindings :
			
 
				+$ pip install .
			
 
				 ```
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				+<br />
			
 
				 Running the binary without any argument will print the high level documentation, showing the different use cases supported by fastText:
			
 
				 
			
 
				 ```bash
			
@@ -53,6 +59,62 @@ The commands supported by fasttext are:
 
				 
			
 
				 In this tutorial, we mainly use the `supervised`, `test` and `predict` subcommands, which correspond to learning (and using) a text classifier. For an introduction to the other functionalities of fastText, please see the [tutorial about learning word vectors](https://fasttext.cc/docs/en/unsupervised-tutorial.html).
			
 
				 
			
 
				+<!--Python-->
			
 
				+<br />
			
 
				+Calling the help function will show high level documentation of the library:
			
 
				+```py
			
 
				+>>> import fasttext
			
 
				+>>> help(fasttext.FastText)
			
 
				+Help on module fasttext.FastText in fasttext:
			
 
				+
			
 
				+NAME
			
 
				+    fasttext.FastText
			
 
				+
			
 
				+DESCRIPTION
			
 
				+    # Copyright (c) 2017-present, Facebook, Inc.
			
 
				+    # All rights reserved.
			
 
				+    #
			
 
				+    # This source code is licensed under the MIT license found in the
			
 
				+    # LICENSE file in the root directory of this source tree.
			
 
				+
			
 
				+FUNCTIONS
			
 
				+    load_model(path)
			
 
				+        Load a model given a filepath and return a model object.
			
 
				+    
			
 
				+    read_args(arg_list, arg_dict, arg_names, default_values)
			
 
				+    
			
 
				+    tokenize(text)
			
 
				+        Given a string of text, tokenize it and return a list of tokens
			
 
				+    
			
 
				+    train_supervised(*kargs, **kwargs)
			
 
				+        Train a supervised model and return a model object.
			
 
				+        
			
 
				+        input must be a filepath. The input text does not need to be tokenized
			
 
				+        as per the tokenize function, but it must be preprocessed and encoded
			
 
				+        as UTF-8. You might want to consult standard preprocessing scripts such
			
 
				+        as tokenizer.perl mentioned here: http://www.statmt.org/wmt07/baseline.html
			
 
				+        
			
 
				+        The input file must must contain at least one label per line. For an
			
 
				+        example consult the example datasets which are part of the fastText
			
 
				+        repository such as the dataset pulled by classification-example.sh.
			
 
				+    
			
 
				+    train_unsupervised(*kargs, **kwargs)
			
 
				+        Train an unsupervised model and return a model object.
			
 
				+        
			
 
				+        input must be a filepath. The input text does not need to be tokenized
			
 
				+        as per the tokenize function, but it must be preprocessed and encoded
			
 
				+        as UTF-8. You might want to consult standard preprocessing scripts such
			
 
				+        as tokenizer.perl mentioned here: http://www.statmt.org/wmt07/baseline.html
			
 
				+        
			
 
				+        The input field must not contain any labels or use the specified label prefix
			
 
				+        unless it is ok for those words to be ignored. For an example consult the
			
 
				+        dataset pulled by the example script word-vector-example.sh, which is
			
 
				+        part of the fastText repository.
			
 
				+```
			
 
				+
			
 
				+In this tutorial, we mainly use `train_supervised`, which returns a model object, and call `test` and `predict` on this object. That corresponds to learning (and using) a text classifier. For an introduction to the other functionalities of fastText, please see the [tutorial about learning word vectors](https://fasttext.cc/docs/en/unsupervised-tutorial.html).
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				+
			
 
				 ## Getting and preparing the data
			
 
				 
			
 
				 As mentioned in the introduction, we need labeled data to train our supervised classifier. In this tutorial, we are interested in building a classifier to automatically recognize the topic of a stackexchange question about cooking. Let's download examples of questions from [the cooking section of Stackexchange](http://cooking.stackexchange.com/), and their associated tags:
			
@@ -82,6 +144,8 @@ Our full dataset contains 15404 examples. Let's split it into a training set of
 
				 
			
 
				 We are now ready to train our first classifier:
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```bash
			
 
				 >> ./fasttext supervised -input cooking.train -output model_cooking
			
 
				 Read 0M words
			
@@ -92,8 +156,27 @@ Progress: 100.0%  words/sec/thread: 75109  lr: 0.000000  loss: 5.708354  eta: 0h
 
				 
			
 
				 The `-input` command line option indicates the file containing the training examples, while the `-output` option indicates where to save the model. At the end of training, a file `model_cooking.bin`, containing the trained classifier, is created in the current directory.
			
 
				 
			
 
				-It is possible to directly test our classifier interactively, by running the command:
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> import fasttext
			
 
				+>>> model = fasttext.train_supervised(input="cooking.train")
			
 
				+Read 0M words
			
 
				+Number of words:  14598
			
 
				+Number of labels: 734
			
 
				+Progress: 100.0%  words/sec/thread: 75109  lr: 0.000000  loss: 5.708354  eta: 0h0m
			
 
				+```
			
 
				+The `input` argument indicates the file containing the training examples. We can now use the `model` variable to access information on the trained model.
			
 
				+
			
 
				+We can also call `save_model` to save it as a file and load it later with the `load_model` function.
			
 
				+```py
			
 
				+>>> model.save_model("model_cooking.bin")
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				+
			
 
				 
			
 
				+Now, we can test our classifier:
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```bash
			
 
				 >> ./fasttext predict model_cooking.bin -
			
 
				 ```
			
@@ -106,8 +189,27 @@ The predicted tag is `baking`  which fits well to this question. Let us now try
 
				 
			
 
				 *Why not put knives in the dishwasher?*
			
 
				 
			
 
				-The label predicted by the model is `food-safety`, which is not relevant. Somehow, the model seems to fail on simple examples. To get a better sense of its quality, let's test it on the validation data by running:
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> model.predict("Which baking dish is best to bake a banana bread ?")
			
 
				+((u'__label__baking',), array([0.15613931]))
			
 
				+```
			
 
				+The predicted tag is `baking`  which fits well to this question. Let us now try a second example:
			
 
				+
			
 
				+```py
			
 
				+>>> model.predict("Why not put knives in the dishwasher?")
			
 
				+((u'__label__food-safety',), array([0.08686075]))
			
 
				+```
			
 
				+
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				+
			
 
				+
			
 
				+The label predicted by the model is `food-safety`, which is not relevant. Somehow, the model seems to fail on simple examples.
			
 
				 
			
 
				+To get a better sense of its quality, let's test it on the validation data by running:
			
 
				+
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```bash
			
 
				 >> ./fasttext test model_cooking.bin cooking.valid
			
 
				 N  3000
			
@@ -115,9 +217,20 @@ P@1  0.124
 
				 R@1  0.0541
			
 
				 Number of examples: 3000
			
 
				 ```
			
 
				+The outputs of fastText are the precision at one (`P@1`) and the recall at one (`R@1`).
			
 
				+
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> model.test("cooking.valid")
			
 
				+(3000L, 0.124, 0.0541)
			
 
				+```
			
 
				+The outputs are the number of samples (here `3000`), the precision at one (`0.124`) and the recall at one (`0.0541`).
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				 
			
 
				-The output of fastText are the precision at one (`P@1`) and the recall at one (`R@1`). We can also compute the precision at five and recall at five with:
			
 
				+We can also compute the precision at five and recall at five with:
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```bash
			
 
				 >> ./fasttext test model_cooking.bin cooking.valid 5
			
 
				 N  3000
			
@@ -125,6 +238,13 @@ P@5  0.0668
 
				 R@5  0.146
			
 
				 Number of examples: 3000
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> model.test("cooking.valid", k=5)
			
 
				+(3000L, 0.0668, 0.146)
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				+
			
 
				 
			
 
				 ## Advanced readers: precision and recall
			
 
				 
			
@@ -134,9 +254,17 @@ The precision is the number of correct labels among the labels predicted by fast
 
				 
			
 
				 On Stack Exchange, this sentence is labeled with three tags: `equipment`, `cleaning` and `knives`. The top five labels predicted by the model can be obtained with:
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```bash
			
 
				 >> ./fasttext predict model_cooking.bin - 5
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> model.predict("Why not put knives in the dishwasher?", k=5)
			
 
				+((u'__label__food-safety', u'__label__baking', u'__label__equipment', u'__label__substitutions', u'__label__bread'), array([0.0857 , 0.0657, 0.0454, 0.0333, 0.0333]))
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				 
			
 
				 are `food-safety`, `baking`, `equipment`, `substitutions` and `bread`.
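The dishwasher example can be checked by hand. The sketch below computes precision and recall for this single example from the three true tags and the five predicted labels listed above. The function `precision_recall` is a hypothetical helper, not a fastText call, and it does not show how the `test` command aggregates over a whole validation file.

```py
def precision_recall(predicted, gold):
    """Return (precision, recall) for one example with non-empty label lists."""
    correct = len(set(predicted) & set(gold))
    precision = correct / len(predicted)  # correct labels among predicted ones
    recall = correct / len(gold)          # correct labels among the true ones
    return precision, recall

gold = ["equipment", "cleaning", "knives"]
predicted = ["food-safety", "baking", "equipment", "substitutions", "bread"]

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.20 recall=0.33
```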
			
 
				 
			
@@ -160,12 +288,14 @@ Looking at the data, we observe that some words contain uppercase letter or punc
 
				 
			
 
				 Let's train a new model on the pre-processed data:
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```bash
			
 
				 >> ./fasttext supervised -input cooking.train -output model_cooking
			
 
				 Read 0M words
			
 
				 Number of words:  9012
			
 
				 Number of labels: 734
			
 
				-Progress: 100.0%  words/sec/thread: 82041  lr: 0.000000  loss: 5.671649  eta: 0h0m h-14m
			
 
				+Progress: 100.0%  words/sec/thread: 82041  lr: 0.000000  loss: 5.671649  eta: 0h0m
			
 
				 
			
 
				 >> ./fasttext test model_cooking.bin cooking.valid
			
 
				 N  3000
			
@@ -173,6 +303,19 @@ P@1  0.164
 
				 R@1  0.0717
			
 
				 Number of examples: 3000
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> import fasttext
			
 
				+>>> model = fasttext.train_supervised(input="cooking.train")
			
 
				+Read 0M words
			
 
				+Number of words:  9012
			
 
				+Number of labels: 734
			
 
				+Progress: 100.0%  words/sec/thread: 82041  lr: 0.000000  loss: 5.671649  eta: 0h0m
			
 
				+
			
 
				+>>> model.test("cooking.valid")
			
 
				+(3000L, 0.164, 0.0717)
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				 
			
 
				 We observe that thanks to the pre-processing, the vocabulary is smaller (from 14k words to 9k). The precision at one also goes up by 4 percentage points (from 12.4% to 16.4%)!
			
 
				 
			
@@ -180,6 +323,8 @@ We observe that thanks to the pre-processing, the vocabulary is smaller (from 14
 
				 
			
 
				 By default, fastText sees each training example only five times during training, which is pretty small, given that our training set only has 12k training examples. The number of times each example is seen (also known as the number of epochs) can be increased using the `-epoch` option:
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```bash
			
 
				 >> ./fasttext supervised -input cooking.train -output model_cooking -epoch 25
			
 
				 Read 0M words
			
@@ -187,9 +332,21 @@ Number of words:  9012
 
				 Number of labels: 734
			
 
				 Progress: 100.0%  words/sec/thread: 77633  lr: 0.000000  loss: 7.147976  eta: 0h0m
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> import fasttext
			
 
				+>>> model = fasttext.train_supervised(input="cooking.train", epoch=25)
			
 
				+Read 0M words
			
 
				+Number of words:  9012
			
 
				+Number of labels: 734
			
 
				+Progress: 100.0%  words/sec/thread: 77633  lr: 0.000000  loss: 7.147976  eta: 0h0m
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				 
			
 
				 Let's test the new model:
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```bash
			
 
				 >> ./fasttext test model_cooking.bin cooking.valid
			
 
				 N  3000
			
@@ -197,9 +354,17 @@ P@1  0.501
 
				 R@1  0.218
			
 
				 Number of examples: 3000
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> model.test("cooking.valid")
			
 
				+(3000L, 0.501, 0.218)
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				 
			
 
				 This is much better! Another way to change the learning speed of our model is to increase (or decrease) the learning rate of the algorithm. This corresponds to how much the model changes after processing each example. A learning rate of 0 would mean that the model does not change at all, and thus, does not learn anything. Good values of the learning rate are in the range `0.1 - 1.0`.
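To make the role of the learning rate concrete, here is a minimal sketch of a single gradient step on one weight. It is a generic gradient descent update and not fastText's actual training loop; `sgd_step` and the numbers are made up for illustration.

```py
def sgd_step(weight, gradient, lr):
    """One gradient descent update: move against the gradient, scaled by lr."""
    return weight - lr * gradient

w = 0.8      # hypothetical current weight
grad = 0.5   # hypothetical gradient of the loss for one example

print(sgd_step(w, grad, lr=0.0))  # learning rate 0: the weight does not change
print(sgd_step(w, grad, lr=0.1))  # small step
print(sgd_step(w, grad, lr=1.0))  # large step
```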
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```bash
			
 
				 >> ./fasttext supervised -input cooking.train -output model_cooking -lr 1.0  
			
 
				 Read 0M words
			
@@ -213,9 +378,23 @@ P@1  0.563
 
				 R@1  0.245
			
 
				 Number of examples: 3000
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> model = fasttext.train_supervised(input="cooking.train", lr=1.0)
			
 
				+Read 0M words
			
 
				+Number of words:  9012
			
 
				+Number of labels: 734
			
 
				+Progress: 100.0%  words/sec/thread: 81469  lr: 0.000000  loss: 6.405640  eta: 0h0m
			
 
				+
			
 
				+>>> model.test("cooking.valid")
			
 
				+(3000L, 0.563, 0.245)
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				 
			
 
				 Even better! Let's try both together:
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```bash
			
 
				 >> ./fasttext supervised -input cooking.train -output model_cooking -lr 1.0 -epoch 25
			
 
				 Read 0M words
			
@@ -229,6 +408,18 @@ P@1  0.585
 
				 R@1  0.255
			
 
				 Number of examples: 3000
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> model = fasttext.train_supervised(input="cooking.train", lr=1.0, epoch=25)
			
 
				+Read 0M words
			
 
				+Number of words:  9012
			
 
				+Number of labels: 734
			
 
				+Progress: 100.0%  words/sec/thread: 76394  lr: 0.000000  loss: 4.350277  eta: 0h0m
			
 
				+
			
 
				+>>> model.test("cooking.valid")
			
 
				+(3000L, 0.585, 0.255)
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				 
			
 
				 Let us now add a few more features to improve even further our performance!
			
 
				 
			
@@ -236,6 +427,8 @@ Let us now add a few more features to improve even further our performance!
 
				 
			
 
				 Finally, we can improve the performance of a model by using word bigrams, instead of just unigrams. This is especially important for classification problems where word order is important, such as sentiment analysis.
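As an illustration of what word bigrams are, the sketch below lists the unigrams and bigrams of a short token sequence. The helper `word_ngrams` is hypothetical; how fastText stores and hashes n-grams internally is not shown here.

```py
def word_ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "why not put knives in the dishwasher".split()

unigrams = word_ngrams(tokens, 1)  # single words, order is lost
bigrams = word_ngrams(tokens, 2)   # pairs of adjacent words keep local order

print(unigrams)
print(bigrams)
```

Unlike unigrams, the bigram `why not` differs from `not why`, which is why bigrams help when word order matters.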
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```bash
			
 
				 >> ./fasttext supervised -input cooking.train -output model_cooking -lr 1.0 -epoch 25 -wordNgrams 2
			
 
				 Read 0M words
			
@@ -249,6 +442,18 @@ P@1  0.599
 
				 R@1  0.261
			
 
				 Number of examples: 3000
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> model = fasttext.train_supervised(input="cooking.train", lr=1.0, epoch=25, wordNgrams=2)
			
 
				+Read 0M words
			
 
				+Number of words:  9012
			
 
				+Number of labels: 734
			
 
				+Progress: 100.0%  words/sec/thread: 75366  lr: 0.000000  loss: 3.226064  eta: 0h0m
			
 
				+
			
 
				+>>> model.test("cooking.valid")
			
 
				+(3000L, 0.599, 0.261)
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				 
			
 
				 With a few steps, we were able to go from a precision at one of 12.4% to 59.9%. Important steps included:
			
 
				 
			
@@ -274,6 +479,8 @@ It is common to refer to a word as a unigram.
 
				 
			
 
				 Since we are training our model on a few thousand examples, the training only takes a few seconds. But training models on larger datasets, with more labels, can start to be too slow. A potential solution to make the training faster is to use the [hierarchical softmax](#advanced-readers-hierarchical-softmax), instead of the regular softmax. This can be done with the option `-loss hs`:
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```bash
			
 
				 >> ./fasttext supervised -input cooking.train -output model_cooking -lr 1.0 -epoch 25 -wordNgrams 2 -bucket 200000 -dim 50 -loss hs
			
 
				 Read 0M words
			
@@ -281,6 +488,15 @@ Number of words:  9012
 
				 Number of labels: 734
			
 
				 Progress: 100.0%  words/sec/thread: 2199406  lr: 0.000000  loss: 1.718807  eta: 0h0m
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> model = fasttext.train_supervised(input="cooking.train", lr=1.0, epoch=25, wordNgrams=2, bucket=200000, dim=50, loss='hs')
			
 
				+Read 0M words
			
 
				+Number of words:  9012
			
 
				+Number of labels: 734
			
 
				+Progress: 100.0%  words/sec/thread: 2199406  lr: 0.000000  loss: 1.718807  eta: 0h0m
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				 
			
 
				 Training should now take less than a second.
			
 
				 
			
@@ -301,6 +517,8 @@ When we want to assign a document to multiple labels, we can still use the softm
 
				 
			
 
				 A convenient way to handle multiple labels is to use independent binary classifiers for each label. This can be done with `-loss one-vs-all` or `-loss ova`.
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```bash
			
 
				 >> ./fasttext supervised -input cooking.train -output model_cooking -lr 0.5 -epoch 25 -wordNgrams 2 -bucket 200000 -dim 50 -loss one-vs-all
			
 
				 Read 0M words
			
@@ -308,10 +526,22 @@ Number of words:  14543
 
				 Number of labels: 735
			
 
				 Progress: 100.0% words/sec/thread:   72104 lr:  0.000000 loss:  4.340807 ETA:   0h 0m
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> import fasttext
			
 
				+>>> model = fasttext.train_supervised(input="cooking.train", lr=0.5, epoch=25, wordNgrams=2, bucket=200000, dim=50, loss='ova')
			
 
				+Read 0M words
			
 
				+Number of words:  14543
			
 
				+Number of labels: 735
			
 
				+Progress: 100.0% words/sec/thread:   72104 lr:  0.000000 loss:  4.340807 ETA:   0h 0m
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				 
			
 
				 It is a good idea to decrease the learning rate compared to other loss functions.
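The difference between a softmax and independent binary classifiers can be sketched in a few lines. This is an illustration with made-up scores, not fastText's implementation; `softmax` and `sigmoid` are written out by hand.

```py
import math

def softmax(scores):
    """Normalize scores into probabilities that sum to 1 across labels."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(score):
    """Probability for one label, computed independently of the others."""
    return 1.0 / (1.0 + math.exp(-score))

scores = [4.0, 3.5, -2.0]  # hypothetical raw scores for three labels

print(softmax(scores))               # labels compete: the values sum to 1
print([sigmoid(s) for s in scores])  # each label is scored on its own
```

With the softmax, two relevant labels have to share the probability mass, whereas with independent classifiers both can receive a probability close to 1.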
			
 
				 
			
 
				 Now let's have a look at our predictions. We want as many predictions as possible (argument `-1`) and we want only labels with probability higher than or equal to `0.5`:
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				 ```bash
			
 
				 >> ./fasttext predict-prob model_cooking.bin - -1 0.5
			
 
				 ```
			
@@ -323,9 +553,19 @@ we get:
 
				 ```
			
 
				 __label__baking 1.00000 __label__bananas 0.939923 __label__bread 0.592677
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+```py
			
 
				+>>> model.predict("Which baking dish is best to bake a banana bread ?", k=-1, threshold=0.5)
			
 
				+((u'__label__baking', u'__label__bananas', u'__label__bread'), array([1.00000, 0.939923, 0.592677]))
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
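The effect of the threshold can be sketched without fastText: given labels with probabilities, keep those at or above the threshold. The helper `filter_predictions`, the fourth label and its probability are made up for illustration; the first three values come from the output above.

```py
def filter_predictions(labels, probs, threshold):
    """Keep only the (label, probability) pairs with probability >= threshold."""
    return [(label, p) for label, p in zip(labels, probs) if p >= threshold]

labels = ["__label__baking", "__label__bananas", "__label__bread", "__label__equipment"]
probs = [1.0, 0.939923, 0.592677, 0.21]  # the last entry is hypothetical

for label, p in filter_predictions(labels, probs, threshold=0.5):
    print(label, p)
```

Raising the threshold keeps fewer, more confident labels, and lowering it keeps more of them.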
			
 
				+
			
 
				 
			
 
				-We can also evaluate our results with the `test` command :
			
 
				 
			
 
				+<!--DOCUSAURUS_CODE_TABS-->
			
 
				+<!--Command line-->
			
 
				+<br />
			
 
				+We can also evaluate our results with the `test` command :
			
 
				 ```bash
			
 
				 >> ./fasttext test model_cooking.bin cooking.valid -1 0.5
			
 
				 N 3000
			
@@ -333,7 +573,6 @@ P@-1  0.702
 
				 R@-1  0.2
			
 
				 Number of examples: 3000
			
 
				 ```
			
 
				-
			
 
				 and play with the threshold to obtain desired precision/recall metrics :
			
 
				 
			
 
				 ```bash
			
@@ -343,6 +582,15 @@ P@-1  0.591
 
				 R@-1  0.272
			
 
				 Number of examples: 3000
			
 
				 ```
			
 
				+<!--Python-->
			
 
				+<br />
			
 
				+We can also evaluate our results with the `test` function:
			
 
				+```py
			
 
				+>>> model.test("cooking.valid", k=-1)
			
 
				+(3000L, 0.702, 0.2)
			
 
				+```
			
 
				+<!--END_DOCUSAURUS_CODE_TABS-->
			
 
				+
			
 
				 
			
 
				 ## Conclusion
			
 
				 
			
--- a/website/siteConfig.js
+++ b/website/siteConfig.js
@@ -96,7 +96,10 @@ const siteConfig = {
 
				   /* remove this to disable google analytics tracking */
			
 
				   gaTrackingId: "UA-44373548-30",
			
 
				   ogImage: "img/ogimage.png",
			
 
				-  useEnglishUrl: true
			
 
				+  useEnglishUrl: true,
			
 
				+  scripts: [
			
 
				+    '/tabber.js',
			
 
				+  ],
			
 
				 };
			
 
				 
			
 
				 module.exports = siteConfig;
			
--- a/website/static/tabber.js
+++ b/website/static/tabber.js
@@ -0,0 +1,42 @@
 
				+function addLoadEvent(func) {
			
 
				+  var oldonload = window.onload;
			
 
				+  if (typeof window.onload != 'function') {
			
 
				+    window.onload = func;
			
 
				+  } else {
			
 
				+    window.onload = function() {
			
 
				+      if (oldonload) {
			
 
				+        oldonload();
			
 
				+      }
			
 
				+      func();
			
 
				+    }
			
 
				+  }
			
 
				+}
			
 
				+
			
 
				+
			
 
				+function tabber(){
			
 
				+    let navTabs = document.getElementsByClassName("nav-tabs");
			
 
				+    let selectAll = function(ind){
			
 
				+        for(let navTab of navTabs){
			
 
				+            let dom = navTab.childNodes[ind];
			
 
				+            let old = dom.onclick;
			
 
				+            dom.onclick = null;
			
 
				+            dom.click();
			
 
				+            dom.onclick = old;
			
 
				+        }
			
 
				+    }
			
 
				+    let registerAll = function(){
			
 
				+        for(let navTab of navTabs){
			
 
				+            let commandLineTab = navTab.childNodes[0];
			
 
				+            let pythonTab = navTab.childNodes[1];
			
 
				+            commandLineTab.onclick = function(){
			
 
				+                selectAll(0);
			
 
				+            }
			
 
				+            pythonTab.onclick = function(){
			
 
				+                selectAll(1);
			
 
				+            }
			
 
				+        }
			
 
				+    }
			
 
				+    registerAll();
			
 
				+};
			
 
				+
			
 
				+addLoadEvent(tabber);