update readme
Ofey committed Nov 28, 2019
1 parent 8200a69 commit 615eaf2
Showing 2 changed files with 90 additions and 2 deletions.
90 changes: 89 additions & 1 deletion README.md
@@ -1,5 +1,93 @@
# BERT4Rec


## Usage

TBD
**Requirements**

* Python 2.7+
* TensorFlow 1.12 (GPU version)
* CUDA compatible with TF 1.12

**Run**

For simplicity, here we take ml-1m as an example:

``` bash
./run_ml-1m.sh
```
The script runs two commands. Generate the masked training data:
``` bash
python -u gen_data_fin.py \
--dataset_name=${dataset_name} \
--max_seq_length=${max_seq_length} \
--max_predictions_per_seq=${max_predictions_per_seq} \
--mask_prob=${mask_prob} \
--dupe_factor=${dupe_factor} \
--masked_lm_prob=${masked_lm_prob} \
--prop_sliding_window=${prop_sliding_window} \
--signature=${signature} \
--pool_size=${pool_size}
```
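
To illustrate what `masked_lm_prob` and `max_predictions_per_seq` control, here is a minimal sketch of Cloze-style masking. It is not the code in `gen_data_fin.py`; the function name `mask_sequence`, the `[MASK]` token string, and the toy item ids are assumptions made for illustration. The values 0.6 and 30 are the ones set in `run_beauty.sh`.

```python
import random

MASK = "[MASK]"  # assumed placeholder token, for illustration only


def mask_sequence(items, masked_lm_prob, max_predictions_per_seq, rng):
    """Hide a random subset of items; return masked sequence, positions, labels."""
    # Mask roughly masked_lm_prob of the sequence: at least one item,
    # and never more than max_predictions_per_seq.
    num_to_mask = min(max_predictions_per_seq,
                      max(1, int(round(len(items) * masked_lm_prob))))
    positions = sorted(rng.sample(range(len(items)), num_to_mask))
    masked = list(items)
    labels = []
    for pos in positions:
        labels.append(items[pos])  # the model is trained to recover this item
        masked[pos] = MASK
    return masked, positions, labels


rng = random.Random(0)  # fixed seed so the example is repeatable
history = ["item_%d" % i for i in range(10)]  # toy interaction history
masked, positions, labels = mask_sequence(history, 0.6, 30, rng)
print(masked)
print(positions, labels)
```

With a 10-item history and `masked_lm_prob=0.6`, six positions are masked; the cap of 30 only matters for longer sequences.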

Train the model:
``` bash
CUDA_VISIBLE_DEVICES=0 python -u run.py \
--train_input_file=./data/${dataset_name}${signature}.train.tfrecord \
--test_input_file=./data/${dataset_name}${signature}.test.tfrecord \
--vocab_filename=./data/${dataset_name}${signature}.vocab \
--user_history_filename=./data/${dataset_name}${signature}.his \
--checkpointDir=${CKPT_DIR}/${dataset_name} \
--signature=${signature}-${dim} \
--do_train=True \
--do_eval=True \
--bert_config_file=./bert_train/bert_config_${dataset_name}_${dim}.json \
--batch_size=${batch_size} \
--max_seq_length=${max_seq_length} \
--max_predictions_per_seq=${max_predictions_per_seq} \
--num_train_steps=${num_train_steps} \
--num_warmup_steps=100 \
--learning_rate=1e-4
```
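
The following sketch shows how `--learning_rate`, `--num_warmup_steps`, and `--num_train_steps` would interact, assuming `run.py` keeps the schedule of the original BERT code (linear warmup, then linear decay to zero). That assumption is not confirmed by this README, and the function name `learning_rate_at` is hypothetical. The value 400000 for `num_train_steps` is the one in `run_beauty.sh`.

```python
def learning_rate_at(step, init_lr=1e-4, num_train_steps=400000,
                     num_warmup_steps=100):
    """Learning rate at a given step under a BERT-style schedule (assumed)."""
    if step < num_warmup_steps:
        # Linear warmup from 0 up to init_lr.
        return init_lr * float(step) / num_warmup_steps
    # Linear decay from init_lr down to 0 at num_train_steps.
    clipped = min(step, num_train_steps)
    return init_lr * (1.0 - float(clipped) / num_train_steps)


for step in (0, 50, 100, 200000, 400000):
    print(step, learning_rate_at(step))
```

Under this assumption, 100 warmup steps is a very small fraction of 400000 training steps, so the rate reaches roughly `1e-4` almost immediately and then decays slowly.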

### Hyper-parameter settings
Model hyper-parameters are set in a JSON file under `bert_train`, for example `bert_config_ml-1m_64.json`:

```json
{
"attention_probs_dropout_prob": 0.2,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.2,
"hidden_size": 64,
"initializer_range": 0.02,
"intermediate_size": 256,
"max_position_embeddings": 200,
"num_attention_heads": 2,
"num_hidden_layers": 2,
"type_vocab_size": 2,
"vocab_size": 3420
}
```
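
When editing a config, two constraints from standard BERT implementations are worth checking: `hidden_size` must be divisible by `num_attention_heads`, and `max_position_embeddings` must be at least the `max_seq_length` passed on the command line. The sketch below checks both; the helper name `check_config` and the `max_seq_length` value of 200 are assumptions for illustration, not taken from the repository.

```python
import json

# A subset of the config above, inlined so the example is self-contained.
CONFIG_TEXT = """
{
  "hidden_size": 64,
  "intermediate_size": 256,
  "max_position_embeddings": 200,
  "num_attention_heads": 2,
  "num_hidden_layers": 2,
  "vocab_size": 3420
}
"""


def check_config(config, max_seq_length):
    """Validate a BERT-style config and return the per-head dimension."""
    hidden = config["hidden_size"]
    heads = config["num_attention_heads"]
    if hidden % heads != 0:
        raise ValueError("hidden_size (%d) is not a multiple of "
                         "num_attention_heads (%d)" % (hidden, heads))
    if max_seq_length > config["max_position_embeddings"]:
        raise ValueError("max_seq_length (%d) exceeds max_position_embeddings "
                         "(%d)" % (max_seq_length,
                                   config["max_position_embeddings"]))
    return hidden // heads


config = json.loads(CONFIG_TEXT)
head_size = check_config(config, max_seq_length=200)  # 200 is an assumed value
print(head_size)  # 32
```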


## Reference

```TeX
@inproceedings{Sun:2019:BSR:3357384.3357895,
author = {Sun, Fei and Liu, Jun and Wu, Jian and Pei, Changhua and Lin, Xiao and Ou, Wenwu and Jiang, Peng},
title = {BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer},
booktitle = {Proceedings of the 28th ACM International Conference on Information and Knowledge Management},
series = {CIKM '19},
year = {2019},
isbn = {978-1-4503-6976-3},
location = {Beijing, China},
pages = {1441--1450},
numpages = {10},
url = {http://doi.acm.org/10.1145/3357384.3357895},
doi = {10.1145/3357384.3357895},
acmid = {3357895},
publisher = {ACM},
address = {New York, NY, USA}
}
```
2 changes: 1 addition & 1 deletion run_beauty.sh
@@ -4,7 +4,7 @@ max_seq_length=50
max_predictions_per_seq=30
masked_lm_prob=0.6

-dim=16
+dim=64
batch_size=256
num_train_steps=400000
