TensorFlow Ranking is a library for Learning-to-Rank (LTR) techniques on the TensorFlow platform

Last update: Jan 04, 2023

Overview

TensorFlow Ranking

TensorFlow Ranking is a library for Learning-to-Rank (LTR) techniques on the TensorFlow platform. It contains the following components:

Commonly used loss functions including pointwise, pairwise, and listwise losses.
Commonly used ranking metrics like Mean Reciprocal Rank (MRR) and Normalized Discounted Cumulative Gain (NDCG).
Multi-item (also known as groupwise) scoring functions.
LambdaLoss implementation for direct ranking metric optimization.
Unbiased Learning-to-Rank from biased feedback data.
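
To make the metrics named in the list above concrete, here is a minimal pure-Python sketch of the textbook definitions of reciprocal rank and NDCG for a single ranked list. The function names, the 2^rel - 1 gain, and the log2 discount are common conventions assumed for illustration; this is not a copy of the library's implementation, which may differ in details such as top-k cutoffs and weighting.

```python
import math


def reciprocal_rank(relevances):
    """Return 1/rank of the first relevant item, or 0.0 if none is relevant.

    `relevances` lists the labels in ranked order (best-scored item first).
    """
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def dcg(relevances):
    """Discounted cumulative gain with gain 2^rel - 1 and a log2 discount."""
    return sum((2 ** rel - 1) / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0


ranked_labels = [0, 2, 1]  # labels of three items, ordered by model score
print(reciprocal_rank(ranked_labels))
print(ndcg(ranked_labels))
```

MRR is then the mean of `reciprocal_rank` over all queries, and a perfectly ordered list gives an NDCG of 1.0.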

We envision that this library will provide a convenient open platform for hosting and advancing state-of-the-art ranking models based on deep learning techniques, and thus facilitate both academic research and industrial applications.

Tutorial Slides

TF-Ranking was presented at premier conferences in Information Retrieval, SIGIR 2019 and ICTIR 2019! The slides are available here.

Demos

We provide a demo, with no installation required, to get started with TF-Ranking. The demo, "Using sparse features and embeddings in TF-Ranking", runs on a Colaboratory notebook, an interactive Python environment. It demonstrates how to:

Use sparse/embedding features
Process data in TFRecord format
Integrate TensorBoard in a Colab notebook, for the Estimator API

Also see Running Scripts for executable scripts.

Linux Installation

Stable Builds

To install the latest version from PyPI, run the following:

# Installing with the `--upgrade` flag ensures you'll get the latest version.
pip install --user --upgrade tensorflow_ranking

To force a Python 3-specific install, replace pip with pip3 in the above commands. For additional installation help, guidance installing prerequisites, and (optionally) setting up virtual environments, see the TensorFlow installation guide.

Note: TensorFlow is now included as a dependency of the TensorFlow Ranking package (in setup.py). If you wish to use a different version of TensorFlow (e.g., tensorflow-gpu), you may need to uninstall the existing version and then install your desired version:

$ pip uninstall tensorflow
$ pip install tensorflow-gpu

Installing from Source

To build TensorFlow Ranking locally, you will need to install:
- Bazel, an open source build tool.
```
$ sudo apt-get update && sudo apt-get install bazel
```
- Pip, a Python package manager.
```
$ sudo apt-get install python-pip
```
- VirtualEnv, a tool to create isolated Python environments.
```
$ pip install --user virtualenv
```

Clone the TensorFlow Ranking repository.

$ git clone https://github.com/tensorflow/ranking.git

Build the TensorFlow Ranking wheel file and store it in the /tmp/ranking_pip folder.

$ cd ranking  # The folder which was cloned in the previous step.
$ bazel build //tensorflow_ranking/tools/pip_package:build_pip_package
$ bazel-bin/tensorflow_ranking/tools/pip_package/build_pip_package /tmp/ranking_pip

Install the wheel package using pip. Do this inside a virtualenv to avoid clashes with system dependencies.

$ ~/.local/bin/virtualenv -p python3 /tmp/tfr
$ source /tmp/tfr/bin/activate
(tfr) $ pip install /tmp/ranking_pip/tensorflow_ranking*.whl

In some cases, you may want to install a specific version of tensorflow, e.g., tensorflow-gpu or tensorflow==2.0.0. To do so, uninstall the installed version and install the one you need, using either of the following:

(tfr) $ pip uninstall tensorflow
(tfr) $ pip install tensorflow==2.0.0

(tfr) $ pip uninstall tensorflow
(tfr) $ pip install tensorflow-gpu

Run all TensorFlow Ranking tests.

(tfr) $ bazel test //tensorflow_ranking/...

Import the TensorFlow Ranking package in Python (within the virtualenv).
```
(tfr) $ python -c "import tensorflow_ranking"
```

Running Scripts

For ease of experimentation, we also provide a TFRecord example and a LIBSVM example in the form of executable scripts. This is particularly useful for hyperparameter tuning, where the hyperparameters are supplied as flags to the script.

TFRecord Example

Set up the data and directory.

MODEL_DIR=/tmp/tf_record_model && \
TRAIN=tensorflow_ranking/examples/data/train_elwc.tfrecord && \
EVAL=tensorflow_ranking/examples/data/eval_elwc.tfrecord && \
VOCAB=tensorflow_ranking/examples/data/vocab.txt

Build and run.

rm -rf $MODEL_DIR && \
bazel build -c opt \
tensorflow_ranking/examples/tf_ranking_tfrecord_py_binary && \
./bazel-bin/tensorflow_ranking/examples/tf_ranking_tfrecord_py_binary \
--train_path=$TRAIN \
--eval_path=$EVAL \
--vocab_path=$VOCAB \
--model_dir=$MODEL_DIR \
--data_format=example_list_with_context

LIBSVM Example

Set up the data and directory.

OUTPUT_DIR=/tmp/libsvm && \
TRAIN=tensorflow_ranking/examples/data/train.txt && \
VALI=tensorflow_ranking/examples/data/vali.txt && \
TEST=tensorflow_ranking/examples/data/test.txt

Build and run.

rm -rf $OUTPUT_DIR && \
bazel build -c opt \
tensorflow_ranking/examples/tf_ranking_libsvm_py_binary && \
./bazel-bin/tensorflow_ranking/examples/tf_ranking_libsvm_py_binary \
--train_path=$TRAIN \
--vali_path=$VALI \
--test_path=$TEST \
--output_dir=$OUTPUT_DIR \
--num_features=136 \
--num_train_steps=100
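
The LIBSVM example reads plain-text data files. Assuming those files follow the LETOR-style layout `<label> qid:<id> <feature_id>:<value> ...` with 1-based feature ids (an assumption worth checking against the files in tensorflow_ranking/examples/data), one line could be parsed roughly as follows. The helper name `parse_libsvm_line` is hypothetical and not part of the library.

```python
def parse_libsvm_line(line, num_features):
    """Parse '<label> qid:<id> <fid>:<value> ...' into (label, qid, features).

    Feature ids are assumed to be 1-based; missing features default to 0.0.
    """
    tokens = line.split("#", 1)[0].split()  # drop an optional trailing comment
    label = float(tokens[0])
    qid = tokens[1].split(":", 1)[1]
    features = [0.0] * num_features
    for token in tokens[2:]:
        fid, value = token.split(":", 1)
        features[int(fid) - 1] = float(value)
    return label, qid, features


label, qid, features = parse_libsvm_line("2 qid:10 1:0.5 3:1.25", num_features=4)
print(label, qid, features)
```

Lines that share a `qid` belong to the same query and would be grouped into one list before ranking.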

TensorBoard

Training results such as loss and metrics can be visualized using TensorBoard.

(Optional) If you are working on a remote server, set up port forwarding with this command.
```
$ ssh <remote-server> -L 8888:127.0.0.1:8888
```

Install TensorBoard and invoke it with the following commands.

(tfr) $ pip install tensorboard
(tfr) $ tensorboard --logdir $OUTPUT_DIR

Jupyter Notebook

An example Jupyter notebook is available in tensorflow_ranking/examples/handling_sparse_features.ipynb.

To run this notebook, first follow the installation steps above to set up a virtualenv environment with the tensorflow_ranking package installed.
Install Jupyter within the virtualenv.
```
(tfr) $ pip install jupyter
```

Start a Jupyter notebook instance on the remote server.

(tfr) $ jupyter notebook tensorflow_ranking/examples/handling_sparse_features.ipynb \
        --NotebookApp.allow_origin='https://colab.research.google.com' \
        --port=8888

(Optional) If you are working on a remote server, set up port forwarding with this command.
```
$ ssh <remote-server> -L 8888:127.0.0.1:8888
```
Run the notebook.
- Open Jupyter on your local machine at http://localhost:8888/ and browse to the notebook.
- Alternatively, open the notebook in Colaboratory via colab.research.google.com. Choose local runtime and link to port 8888.

References

Rama Kumar Pasumarthi, Sebastian Bruch, Xuanhui Wang, Cheng Li, Michael Bendersky, Marc Najork, Jan Pfeifer, Nadav Golbandi, Rohan Anil, Stephan Wolf. TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank. KDD 2019.
Qingyao Ai, Xuanhui Wang, Sebastian Bruch, Nadav Golbandi, Michael Bendersky, Marc Najork. Learning Groupwise Scoring Functions Using Deep Neural Networks. ICTIR 2019.
Xuanhui Wang, Michael Bendersky, Donald Metzler, and Marc Najork. Learning to Rank with Selection Bias in Personal Search. SIGIR 2016.
Xuanhui Wang, Cheng Li, Nadav Golbandi, Mike Bendersky, Marc Najork. The LambdaLoss Framework for Ranking Metric Optimization. CIKM 2018.

Citation

If you use TensorFlow Ranking in your research and would like to cite it, we suggest you use the following citation:

@inproceedings{TensorflowRankingKDD2019,
   author = {Rama Kumar Pasumarthi and Sebastian Bruch and Xuanhui Wang and Cheng Li and Michael Bendersky and Marc Najork and Jan Pfeifer and Nadav Golbandi and Rohan Anil and Stephan Wolf},
   title = {TF-Ranking: Scalable TensorFlow Library for Learning-to-Rank},
   booktitle = {Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining},
   year = {2019},
   pages = {2970--2978},
   location = {Anchorage, AK}
}

Comments

Parsing ELWC example to tensors consumable by a non-EIE model
versions:

tensorflow==2.3.2

tensorflow-ranking==0.3.0

description: When the example list contains only one item, the model graph can consume the tensors parsed by tfr's parser. However, when the number of items is >= 2, we get errors like the following from tf-serving. This example has an item list (example list) of 2.

ERROR: Code: InvalidArgument Message: Input to reshape is a tensor with 4 values, but the requested shape has 2 [[{{node dnn/input_from_feature_columns/input_layer/feature_A_indicator_1/Reshape}}]]

Here feature_A is an indicator column of bucket size 2, defaulting to 0. The model is DNNLinearCombinedEstimator.

By looking at the source code, I've noticed that the ELWC parser is actually the EIE parser. Could it be that the parser constructs the tensors in an EIE fashion, so that our model, which expects plain batched examples, cannot consume them?

I've tried changing the parser source code to alter the output tensors, but no luck. Any idea? Thank you!
opened by edwardchu-studio 39

When using ELWC - OP_REQUIRES failed at example_parsing_ops.cc:91 : Invalid argument: Could not parse example input

Hello Team,

I trained a TF ranking model (basing my training on the following example: https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/examples/tf_ranking_tfrecord.py) and saved it using estimator.export_saved_model('my_model', serving_input_receiver_fn), the model was trained successfully & saved without any warnings/errors.

I deployed the model to a local TensorFlow ModelServer and made a call to it over HTTP using cURL as described on https://www.tensorflow.org/tfx/serving/api_rest#request_format. Unfortunately I see the following error after making the request:

W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at example_parsing_ops.cc:91 : Invalid argument: Could not parse example input, value: '

ctx_f0

{ "error": "Could not parse example input, value: \'\n\035\n\021ctx_f0\022\010\022\006\n\004\000\000\340@\'\n\t [[{{node ParseExample/ParseExample}}]]" }

I understand that this is a problem that may be related to serialization where my input was not properly serialized, but saving the model by generating serving_input_receiver_fn & using it produced no errors/warnings, so I am not sure where to start looking to resolve this.

I am providing some details below, please let me know if you need more information.

Details

TF framework module versions

tensorflow-serving-api==2.0.0
tensorflow==2.0.0
tensorflow-ranking==0.2.0

Some training parameters and functions

_CONTEXT_FEATURES = {'ctx_f0'}
_DOCUMENT_FEATURES = {'f0', 'f1', 'f2'}
_DATA_FORMAT = tfr.data.ELWC
_PADDING_LABEL = -1

def example_feature_columns():
    spec = {}
    for f in _DOCUMENT_FEATURES:
        spec[f] = tf.feature_column.numeric_column(f, shape=(1,), default_value=_PADDING_LABEL, dtype=tf.float32)
    return spec

def context_feature_columns():
    spec = {}
    for f in _CONTEXT_FEATURES:
        spec[f] = tf.feature_column.numeric_column(f, shape=(1,), default_value=_PADDING_LABEL, dtype=tf.float32)
    return spec

Creating the serving_input_receiver_fn

context_feature_spec = tf.feature_column.make_parse_example_spec(context_feature_columns().values())
example_feature_spec = tf.feature_column.make_parse_example_spec(example_feature_columns().values())

serving_input_receiver_fn = tfr.data.build_ranking_serving_input_receiver_fn(
        data_format=_DATA_FORMAT,
        list_size=20,
        default_batch_size=None,
        receiver_name="input_ranking_data",
        context_feature_spec=context_feature_spec,
        example_feature_spec=example_feature_spec)

When making a REST API call to a local TensorFlow ModelServer using the following cURL request

curl -H "Content-Type: application/json" \
-X POST http://192.168.99.100:8501/v1/models/my_model/versions/1587842143:regress \
-d '{"context": {"ctx_f0": 7.2}, "examples":[{"f0":[35.92],"f1":[5.258],"f2":[5.261]},{"f0":[82.337],"f1":[2.06],"f2":[2.068]}]}'

The error is as follows:

W external/org_tensorflow/tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at example_parsing_ops.cc:91 : Invalid argument: Could not parse example input, value: '

ctx_f0

{ "error": "Could not parse example input, value: \'\n\035\n\021ctx_f0\022\010\022\006\n\004\000\000\340@\'\n\t [[{{node ParseExample/ParseExample}}]]" }

opened by azagniotov 28
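
One possible cause of the parsing error reported above is that the request carries plain JSON features, while a serving input receiver built with tfr.data.build_ranking_serving_input_receiver_fn parses serialized protos. TensorFlow Serving's REST API accepts binary values wrapped as `{"b64": ...}`. The sketch below only shows how such a request body could be assembled. The helper name `build_predict_body`, the placeholder bytes, and the use of the predict API's `inputs` key with the receiver name from the issue are all assumptions; a real payload would be the `SerializeToString()` output of an ExampleListWithContext proto, and a `signature_name` may also need to be set.

```python
import base64
import json


def build_predict_body(serialized_elwc, receiver_name="input_ranking_data"):
    """Wrap serialized proto bytes in the {"b64": ...} form used for binary inputs."""
    encoded = base64.b64encode(serialized_elwc).decode("ascii")
    return json.dumps({"inputs": {receiver_name: [{"b64": encoded}]}})


# Placeholder bytes standing in for elwc.SerializeToString(); not a real proto.
placeholder = b"\x0a\x00"
body = build_predict_body(placeholder)
print(body)
```

Whether this resolves the error depends on the exported signatures, so it is best checked against the SavedModel's signature definitions first.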

Model export

Could you please give a code example of how to export a model for TensorFlow Serving? No luck with estimator.export_saved_model or tf.estimator.BestExporter. I must be doing something wrong with feature_spec.

opened by nzhiltsov 21
Tensorflow_ranking with keras
Hi there,

I want to try Tensorflow_ranking with Keras. (https://github.com/tensorflow/ranking/tree/master/tensorflow_ranking/examples/keras)

As a first step I have made a copy/paste from your Github repo, but I didn't find the TEST/TRAIN/VAL dataset. I tried to use this: tensorflow_ranking/examples/data but it does not work.

My real goal is to make a model that can rank students in different classes based on their achievements (numeric and categorical features).

My questions are:

where could I find data to test code from tensorflow_ranking/examples/keras repo?

how can I use tensorflow_ranking (with Keras) when the dataset is grouped? (Class by Class)

I want to test your code with a Python IDE (Spyder), so I didn't install Bazel and other things.
opened by korosig 20
How long does sparse-model run for ?

Hi guys,

I have been running the sparse-model for the last 5 days on a GPU server and I can't see anything in my models directory, meaning it did not even get to the first checkpoint. Has anyone had experience with this?

I have many features (about 900K), but I still expected to be past the first checkpoint.

Any hints would help here.

Thanks.

opened by mulangonando 17
how to predict?

Urgent! I'm unable to obtain predictions using ranker.predict(test). On printing the predictions it says <generator object EstimatorV2.predict at 0x7ff0c8de5c50>.

opened by prakhar2811 17
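
The printed `<generator object ...>` in the question above is expected: Estimator.predict returns a generator, so values appear only when it is iterated. The sketch below uses a stand-in generator (`fake_predict`, hypothetical) in place of `ranker.predict(...)` to show the pattern without requiring a trained model.

```python
import itertools


def fake_predict(scores):
    """Stand-in for ranker.predict(input_fn=...): yields one prediction at a time."""
    for score in scores:
        yield score


predictions = fake_predict([0.9, 0.1, 0.4])
print(predictions)  # prints a generator object, not the values

# Take the first two predictions; list(predictions) would drain all of them.
first_two = list(itertools.islice(predictions, 2))
print(first_two)
```

With a real estimator, the same `itertools.islice` or `next(...)` pattern applies to the object returned by `predict`.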
Minimal example of prediction using TFR-BERT?

Would it be possible to have a minimal example that performs prediction (ranking) on an unseen set using TFR-BERT?

The current example ( tfrbert_example.py ) trains the model and evaluates performance on a development set, but it would be helpful to see a simple example of ranking on an unseen test set, and exporting these {query, document, rank} tuples to (for example) a plain text file for debugging.

opened by PeterAJansen 15
Non-deterministic results in tf_ranking_tfrecord.py
Hello,

When I run tf_ranking_tfrecord.py, I get different nDCG metrics each time.

I have already tried the following:

tf.random.set_seed(1)

tf.compat.v1.random.set_seed(1)

shuffle=False in _input_fn()

and I have not modified group_size=1

Is it possible to make the results deterministic? And, if so, how?

Thanks.
opened by davidmosca 14

Keras model couldn't save

platform: (CoLab) uname_result(system='Linux', node='1acc1ece6828', release='4.19.104+', version='#1 SMP Wed Feb 19 05:26:34 PST 2020', machine='x86_64', processor='x86_64')

python==3.6.9 tensorflow==2.1.0 tensorflow-ranking==0.3.0

TF-Ranking custom Keras layers can serialize and deserialize, but the custom model fails to save in either saved_model or h5 format. Could there be any issues with the get_config implementations?

_LABEL_FEATURE = "relevance"
_PADDING_LABEL = -1
_SIZE="example_list_size"

def create_feature_columns():
  sparse_column = tf.feature_column.categorical_column_with_hash_bucket(
      key="user_id", hash_bucket_size=100, dtype=tf.int64)
  query_embedding = tf.feature_column.embedding_column(
      categorical_column=sparse_column, dimension=20)
  context_feature_columns = {"user_id": query_embedding}

  sparse_column = tf.feature_column.categorical_column_with_hash_bucket(
      key="document_id", hash_bucket_size=100, dtype=tf.int64)
  document_embedding = tf.feature_column.embedding_column(
      categorical_column=sparse_column, dimension=20)
  example_feature_columns = {"document_id": document_embedding}

  return context_feature_columns, example_feature_columns

context_feature_columns, example_feature_columns = create_feature_columns()

# instantiate keras model
network = tfr.keras.canned.DNNRankingNetwork(
    context_feature_columns=context_feature_columns,
    example_feature_columns=example_feature_columns,
    hidden_layer_dims=[1024, 512, 256],
    activation=tf.nn.relu,
    dropout=0.5)
ranker = tfr.keras.model.create_keras_model(
    network=network,
    loss=tfr.keras.losses.get(tfr.losses.RankingLossKey.SOFTMAX_LOSS),
    metrics=tfr.keras.metrics.default_keras_metrics(),
    optimizer=tf.keras.optimizers.Adagrad(learning_rate=0.05),
    size_feature_name=_SIZE)

# save keras model to saved_model format
ranker.save('tmp')

INFO:tensorflow:Assets written to: tmp/assets
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-90-c6e3a44bcc2b> in <module>()
----> 1 ranker.save('tmp')

11 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/serialization.py in get_json_type(obj)
     70     return obj.__wrapped__
     71 
---> 72   raise TypeError('Not JSON Serializable:', obj)

TypeError: ('Not JSON Serializable:', tf.int64)

opened by yzhangswingman 14

ANTIQUE Dataset Tokenisation

Hi TFR Team,

I tried creating tf-records from raw ANTIQUE dataset, but I couldn't reproduce similar results. Accuracies are pretty low.

Could you please share what kind of tokenisation you used to create the document and query tokens?

Thanks.

opened by divyakyatam 13
Error reading the tfrecords EIE

Can someone help troubleshoot this error (I used the exact EIE converter provided in one of the issues here) :

InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Name: , Feature: serialized_context (data type: string) is required but could not be found.
[[{{node ParseExample/ParseExample}}]]
[[IteratorGetNext]]
[[transform/encoding_layer/qery_tokens_embedding/hash_table_Lookup/LookupTableFindV2/_275]]
(1) Invalid argument: Name: , Feature: serialized_context (data type: string) is required but could not be found.
[[{{node ParseExample/ParseExample}}]]
[[IteratorGetNext]]
0 successful operations.
0 derived errors ignored.

opened by mulangonando 13
is_label_valid in utils.py lacks support for integer targets
Hello. In the function is_label_valid (line 76 of utils.py), the following code is given:

def is_label_valid(labels):
  """Returns a boolean `Tensor` for label validity."""
  labels = tf.convert_to_tensor(value=labels)
  return tf.greater_equal(labels, 0.)

The result is an error if the target is an integer, because 0. is a float and tf.greater_equal expects both arguments to be of the same type. This prevents support for integer targets/labels.
opened by nmonette 0

Fix passing of keyword args to Dense layers in create_tower

Current behavior: kwargs are passed to tf.keras.Sequential.add, so they are not passed on to tf.keras.layers.Dense as intended. For example, passing use_bias=False to create_tower throws an exception:

Traceback (most recent call last):
  File "/Users/brussell/development/ranking/tensorflow_ranking/python/keras/layers_test.py", line 33, in test_create_tower_with_kwargs
    tower = layers.create_tower([3, 2, 1], 1, activation='relu', use_bias=False)
  File "/Users/brussell/development/ranking/tensorflow_ranking/python/keras/layers.py", line 70, in create_tower
    model.add(tf.keras.layers.Dense(units=layer_width), **kwargs)
  File "/usr/local/anaconda3/lib/python3.9/site-packages/tensorflow/python/trackable/base.py", line 205, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "/usr/local/anaconda3/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 61, in error_handler
    return fn(*args, **kwargs)
TypeError: add() got an unexpected keyword argument 'use_bias' test_create_tower_with_kwargs

Fix: This PR fixes the behavior by shifting the closing paren of tf.keras.layers.Dense to the correct location.

opened by b4russell 1
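
The misplaced parenthesis described in the PR above can be reproduced without TensorFlow. In the sketch below, `Dense` and `Sequential` are hypothetical stand-ins for the Keras classes, and the two `create_tower_*` functions are simplified illustrations, not the library's code.

```python
class Dense:
    """Hypothetical stand-in for tf.keras.layers.Dense; records its settings."""

    def __init__(self, units, use_bias=True):
        self.units = units
        self.use_bias = use_bias


class Sequential:
    """Hypothetical stand-in for tf.keras.Sequential; add() takes only a layer."""

    def __init__(self):
        self.layers = []

    def add(self, layer):
        self.layers.append(layer)


def create_tower_buggy(widths, **kwargs):
    model = Sequential()
    for width in widths:
        model.add(Dense(units=width), **kwargs)  # kwargs go to add(): TypeError
    return model


def create_tower_fixed(widths, **kwargs):
    model = Sequential()
    for width in widths:
        model.add(Dense(units=width, **kwargs))  # kwargs reach the Dense layer
    return model


try:
    create_tower_buggy([3, 2], use_bias=False)
    buggy_error = None
except TypeError as err:
    buggy_error = err

tower = create_tower_fixed([3, 2], use_bias=False)
print(type(buggy_error).__name__, [layer.use_bias for layer in tower.layers])
```

Note that the buggy variant only fails when extra keyword arguments are actually passed, which is why such a bug can go unnoticed until a test exercises them.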

package Bert

How do I pack Bert into my textual data? I have query and document pairs, should I package only documents? I ask because of this definition:

SEQ_LENGTH = 64
context_feature_spec = {}
example_feature_spec = {
    'input_word_ids': tf.io.FixedLenFeature(
        shape=(SEQ_LENGTH,), dtype=tf.int64,
        default_value=[7] * SEQ_LENGTH),
    'input_mask': tf.io.FixedLenFeature(
        shape=(SEQ_LENGTH,), dtype=tf.int64,
        default_value=[7] * SEQ_LENGTH),
    'input_type_ids': tf.io.FixedLenFeature(
        shape=(SEQ_LENGTH,), dtype=tf.int64,
        default_value=[7] * SEQ_LENGTH)}
label_spec = (
    "relevance",
    tf.io.FixedLenFeature(shape=(1,), dtype=tf.int64, default_value=-1)
)

Where context_feature_spec = { }

The ANTIQUE dataset already has the keys input_ids, input_mask, relevance and segment_ids. How do I do this for my texts?

In the ranking model there is 'feature_name_mapping', which shows what I should provide and what the model expects.

opened by Tavares8 0

How to create tf records for a custom dataset and run TFR Ranking on it?

The example in https://www.tensorflow.org/ranking/tutorials/ranking_dnn_distributed uses TFRecords directly and doesn't show how the data was converted into them. As per the format, there should be just query, document and relevance in the TFRecord, yet the code crashes every time it runs. Please provide the features/format used in creating the TFRecords for the ANTIQUE dataset.

opened by SubhayanDas08 0
Multi-MLP document representation

This is an RFC on setting up a multi-tower or multi-MLP document representation for TF Ranking.

If there are multiple logical groups of features, it would be good to be able to create a multi-MLP document representation.

For example, with groups such as QU, CTR, Distances, Popularities, etc., if each group contains a few dozen features, it might be useful to have a multi-MLP doc representation such as:

MLP(g1.[f1, f2, fn]) -> g1_h
MLP(g2.[f1, f2, fn]) -> g2_h
MLP(gN.[f1, f2, fn]) -> g3_h
MLP(g1_h + g2_h + g3_h) -> doc_h

Has anyone attempted this with TF ranking?

CC: @ramakumar1729 @bendersky

opened by vitalyli 0

Releases(v0.5.1)

v0.5.1(Oct 26, 2022)
This is the 0.5.1 release of TensorFlow Ranking. We provide new ranking losses, metrics, layers, and a pipeline based on the latest research progress in Learning to Rank and Unbiased Ranking. We also update the API reference on www.tensorflow.org/ranking and in the GitHub docs. The new changes include:

Ranking losses added in tfr.keras.losses:

PairwiseMSELoss: Implement a pairwise mean squared error loss.

OrdinalLoss: Implement a pointwise multi-head ordinal regression on ordered multilabel.

MixtureEMLoss: Implement a listwise Expectation-Maximization algorithm on a mixture model, introduced in Revisiting two tower models for unbiased learning to rank.
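
As a rough illustration of the pairwise mean squared error idea behind PairwiseMSELoss, the sketch below compares score differences with label differences over all item pairs of one list. The function name and the averaging over pairs are assumptions made for illustration; the library's exact reduction and normalization may differ.

```python
def pairwise_mse(labels, scores):
    """Mean over item pairs of ((s_i - s_j) - (y_i - y_j)) ** 2 for one list."""
    n = len(labels)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    if not pairs:
        return 0.0
    total = sum(((scores[i] - scores[j]) - (labels[i] - labels[j])) ** 2
                for i, j in pairs)
    return total / len(pairs)


# Score gaps equal label gaps, so only the constant offset differs.
print(pairwise_mse([2.0, 1.0, 0.0], [5.0, 4.0, 3.0]))
# Scores are tied although the labels differ.
print(pairwise_mse([1.0, 0.0], [0.0, 0.0]))
```

Because only differences enter the loss, adding a constant to every score leaves it unchanged.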

Lambda weights for Lambda losses added in tfr.keras.losses:

NDCGLambdaWeightV2: Implement an NDCG-based lambda weight for lambda losses, introduced in On Optimizing Top-K Metrics for Neural Ranking Models.

LabelDiffLambdaWeight: Implement a lambda weight based on the absolute difference of two labels.

Ranking metric added in tfr.keras.metrics:

HitsMetric: Implement Hits@k metric.
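
A hits-at-k style metric, as commonly defined, checks whether any relevant item appears among the k highest-scored items of a list. The pure-Python sketch below follows that common definition with a relevance threshold of label >= 1; the function name and threshold are assumptions and may not match HitsMetric exactly.

```python
def hits_at_k(labels, scores, k):
    """Return 1.0 if any item with label >= 1 is ranked in the top k, else 0.0."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return 1.0 if any(label >= 1 for _, label in ranked[:k]) else 0.0


labels = [0, 1, 0]
scores = [0.9, 0.5, 0.1]
print(hits_at_k(labels, scores, k=1))
print(hits_at_k(labels, scores, k=2))
```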

Ranking layer added in tfr.keras.layers:

Bilinear: A layer to implement a bilinear interaction of two vectors, used in Revisiting two tower models for unbiased learning to rank.

Ranking pipeline added in tfr.keras.pipeline:

MultiObjectivePipeline: A pipeline to apply multi-objective losses, used in Scale Calibration of Deep Ranking Models.

API reference updated on www.tensorflow.org/ranking and consistently on Github docs.

Dependencies: The following packages will be installed as required when installing tensorflow-ranking. tensorflow-serving-api>= 2.0.0, < 3.0.0 tensorflow>=2.7.0.
v0.5.0(Nov 16, 2021)
This is the 0.5.0 release of TensorFlow Ranking. We provide a detailed overview, tutorial notebooks and API reference on www.tensorflow.org/ranking. The new changes are:

Move task.py and premade tfrbert_task.py to extension.

Remove RankingNetwork based tfr-bert example. The latest tfr-bert example using native Keras is available at tfrbert_antique_train.py.

Remove dependency on tf-models-official package to reduce install time. Users of tfr.ext.task or modules that depend on the above package will need to manually install it.

Updated all docstrings to be more detailed. Made several docstrings testable.

Add colab notebooks for quickstart tutorial and distributed ranking tutorial, also available on www.tensorflow.org/ranking.

Update strategy_utils to support parameter server strategy.

Add symmetric log1p to tfr.utils.
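
A symmetric log1p transform is usually understood as sign(x) * log1p(|x|), which compresses large magnitudes while preserving sign and mapping 0 to 0. The scalar sketch below illustrates that formula with the standard library; it is an assumption about the behavior rather than the tfr.utils implementation, which operates on tensors.

```python
import math


def symmetric_log1p(x):
    """Apply log1p to the magnitude of x and keep its sign: sign(x) * log1p(|x|)."""
    return math.copysign(math.log1p(abs(x)), x)


for value in (-100.0, -1.0, 0.0, 1.0, 100.0):
    print(value, symmetric_log1p(value))
```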

Remove references to Estimator/Feature Column related APIs in API reference.

v0.4.2(Jul 22, 2021)
This is the 0.4.2 release of TensorFlow Ranking. The main change is the TFR-BERT module based on the Orbit framework in tf-models, which helps users write customized training loops. The new components are:

TFR-BERT in Orbit

tfr.keras.task: This module contains the general boilerplate code to train TF-Ranking models in the Orbit framework. Particularly, there are:

RankingDataLoader, which parses an ELWC formatted data record into tensors

RankingTask, which specifies the behaviors of each training and evaluation step, as well as the training losses and evaluation metrics.

In addition, there are config data classes like RankingDataConfig and RankingTaskConfig to store configurations for above classes.

tfr.keras.premade.tfrbert_task: This module contains the TFR-BERT specification of the TF-Ranking Orbit task.

TFRBertDataLoader, which subclasses the RankingDataLoader and further specifies the feature specs of a TFR-BERT model.

TFRBertScorer and TFRBertModelBuilder, which define a model builder that can create a TFR-BERT ranking model as a Keras model, based on tf-models' implementation of the BERT encoder.

TFRBertTask, which is a subclass of RankingTask. It defines the build_model behavior. It also defines the initialization method, which loads a pretrained BERT checkpoint to initialize the encoder, and provides a function to output the prediction results along with query ids and document ids.

In addition, there are config data classes like TFRBertDataConfig, TFRBertModelConfig and TFRBertConfig, which store configurations for the above classes.

examples/keras/tfrbert_antique_train.py: This file provides an example of training a TFR-BERT model on the Antique data set. There is also a .yaml file where users can specify parameter configurations.

Dependencies: The following packages will be installed as required when installing tensorflow-ranking.

tf-models-official >= 2.5.0

tensorflow-serving-api>= 2.0.0, < 3.0.0

tensorflow==2.5.0.

v0.4.0(May 25, 2021)
This release is one of the major releases for TF-Ranking. It provides full support to build and train a native Keras model for ranking problems. It includes necessary Keras layers for a ranking model, a module to construct a model in a flexible manner, and a pipeline to train a model with minimal boilerplate. To get started, please follow the example here. In addition, the new release adds RaggedTensor support in losses and metrics and we provide a handy example to show how to use it in a ranking model.

The new components are listed below:

Keras Layers:

Use input packing for layer signatures for SavedModel compatibility.

create_tower function to create a feedforward neural network with batch normalization and dropout.

GAMLayer, a Keras layer which implements the neural generalized additive ranking model.

Update build method of DocumentInteractionAttention layer to ensure SavedModel is restored correctly.

ModelBuilder to build tf.keras.Model using Functional API:

AbstractModelBuilder class for users to inherit.

ModelBuilder class that wraps the boilerplate code to build tf.keras.Model for a ranking model.

InputCreator abstract class to implement create_inputs in ModelBuilder.

FeatureSpecInputCreator class to create inputs from feature_specs.

TypeSpecInputCreator class to create inputs from type_specs.

Preprocessor abstract class to implement preprocess in ModelBuilder.

PreprocessorWithSpec class to do Keras preprocessing or feature transformations with functions specified in Specs.

Scorer abstract class to implement score in ModelBuilder.

UnivariateScorer class to implement univariate scoring functions.

DNNScorer class to implement fully connected DNN univariate scoring.

GAMScorer class to implement feature based GAM univariate scoring.

Pipeline to wrap the boilerplate code for training:

AbstractDatasetBuilder abstract class to build and serve the dataset for training.

BaseDatasetBuilder class to build training and validation datasets and signatures for SavedModel from feature_specs.

SimpleDatasetBuilder class to build datasets with a single label feature spec.

MultiLabelDatasetBuilder class to build datasets for multi-task learning.

DatasetHparams dataclass to specify all hyper-parameters used in BaseDatasetBuilder class.

AbstractPipeline abstract class to train and validate the ranking tf.keras.Model.

ModelFitPipeline class to train the ranking models using model.fit() compatible with distribution strategies.

SimplePipeline class for single-task training.

MultiTaskPipeline class for multi-task training.

An example client to showcase training a deep neural network model with a distribution strategy using SimplePipeline.

PipelineHparams dataclass to specify all hyper-parameters used in ModelFitPipeline class.

strategy_utils helper module to support tf.distribute strategies.

RaggedTensor support in losses and metrics:

Losses in tfr.keras.losses and metrics in tfr.keras.metrics can act on tf.RaggedTensor inputs. To enable this, set the argument ragged=True when defining the loss and metric objects:

E.g.: loss = tfr.keras.losses.SoftmaxLoss(name='softmax_loss', ragged=True)

Pass the same argument to get to obtain losses and metrics that support ragged tensors: loss = tfr.keras.losses.get('softmax_loss', ragged=True)

An example client to showcase training a deep neural network model using model.fit() with ragged inputs and outputs.

Dependencies: The following packages will be installed as required when installing tensorflow-ranking. tf-models-official >= 2.5.0 tensorflow-serving-api>= 2.0.0, < 3.0.0 tensorflow==2.5.0.
v0.3.3(Feb 2, 2021)
This is the 0.3.3 release of TensorFlow Ranking. It depends on tf-models-official >= 2.4.0 and tensorflow-serving-api>= 2.0.0, < 3.0.0. It is compatible with tensorflow==2.4.1. All of these packages will be installed as required packages when installing tensorflow-ranking.

The main changes in this release contain the Document Interaction Network (DIN) layer and layers for training Keras models using Functional API. The new components are listed below:

Document Interaction Network: See paper.

Building Keras ranking models for DIN using Keras Preprocessing Layers.

Native Keras training: An example client to showcase such a model using model.fit().

Estimator based training: Another example client to showcase training a DIN model as an Estimator.

tfr.keras.layers.DocumentInteractionAttention: A keras layer to model cross-document interactions. Applies cross-document attention across valid examples identified using a mask.

Keras Layers: for easy transformation of context and example features and related utilities.

tfr.keras.layers.FlattenList: Flattens the batch_size dimension and the list_size dimension for the example_features and expands list_size times for the context_features.

tfr.keras.layers.ConcatFeatures: Concatenates context features and example features in a listwise manner.

tfr.keras.layers.RestoreList: Output layer to restore batch_size dimension and list_size dimension for the output shape of logits.
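The shape handling described for FlattenList and RestoreList can be sketched with nested Python lists. This is only an illustration of the reshaping idea under the descriptions above, not the layers' actual code; flatten_list, restore_list, and the toy scorer are hypothetical names.

```python
def flatten_list(context, examples):
    """context: [batch_size][ctx_dim]; examples: [batch_size][list_size][ex_dim].

    Returns two lists of length batch_size * list_size, with each context
    row repeated once per example in its list.
    """
    flat_context, flat_examples = [], []
    for context_row, example_rows in zip(context, examples):
        for example in example_rows:
            flat_context.append(context_row)
            flat_examples.append(example)
    return flat_context, flat_examples

def restore_list(flat_logits, list_size):
    """Reshapes [batch_size * list_size] logits back to [batch_size][list_size]."""
    return [flat_logits[i:i + list_size]
            for i in range(0, len(flat_logits), list_size)]

# batch_size = 2, list_size = 3, one context feature and one example feature.
context = [[10], [20]]
examples = [[[1], [2], [3]], [[4], [5], [6]]]

flat_context, flat_examples = flatten_list(context, examples)
# Toy scorer (hypothetical): concatenate context and example features, then sum.
flat_logits = [sum(c + e) for c, e in zip(flat_context, flat_examples)]
logits = restore_list(flat_logits, list_size=3)
print(logits)
```

Flattening lets a scorer treat every (context, example) pair as an independent row, and restoring the list dimension afterwards gives listwise losses and metrics the per-query grouping they need.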

Others

tfr.keras.metrics.get(metric_key): Add a get metric factory for keras metrics.

Masking support in tfr.data: Added support for parsing a boolean mask tensor, which indicates the valid examples in each list, via the mask_feature_name argument in tfr.data._RankingDataParser and all associated input data parsing and serving_input_fn builders.
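As a rough picture of what such a mask carries, the sketch below pads variable-length lists to a fixed list size and builds the matching boolean mask. It does not reproduce how tfr.data parses records; pad_with_mask and its pad_value parameter are hypothetical.

```python
def pad_with_mask(example_lists, list_size, pad_value=0.0):
    """Pads or truncates each list to list_size and marks the valid positions."""
    padded, mask = [], []
    for examples in example_lists:
        kept = examples[:list_size]
        num_pad = list_size - len(kept)
        padded.append(kept + [pad_value] * num_pad)
        mask.append([True] * len(kept) + [False] * num_pad)
    return padded, mask

features, mask = pad_with_mask([[0.7, 0.1], [0.4, 0.9, 0.2, 0.5]], list_size=3)
print(features)
print(mask)
```

Downstream layers, such as an attention layer over documents, can then use the mask to ignore padded positions.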

v0.3.2(Aug 19, 2020)

In TensorFlow Ranking v0.3.2, we introduce the TFR-BERT extension to better support BERT-based ranking models for text data. BERT is a pre-trained language representation model that has achieved substantial improvements on numerous NLP tasks. We find that fine-tuning BERT with ranking losses further improves ranking performance (arXiv). You can read detailed information about what is included in the TFR-BERT extension here. There is also an example showing how to use TFR-BERT here.
v0.3.1(Jun 1, 2020)
This is the 0.3.1 release of TensorFlow Ranking. It depends on tensorflow-serving-api==2.1.0 and is fully compatible with tensorflow==2.2.0. Both will be installed as required packages when installing tensorflow-ranking.

The main changes in this release are a canned Neural RankGAM estimator, canned DNN estimators, canned Neural RankGAM Keras models, and their examples. The new components are:

Neural RankGAM

make_gam_ranking_estimator, which makes a canned estimator of a neural ranking generalized additive model. An example client to showcase the usage of make_gam_ranking_estimator.

make_dnn_ranking_estimator, which makes a canned estimator of a feed-forward neural network for ranking. An example client to showcase the usage of make_dnn_ranking_estimator.

GAMRankingNetwork, which encapsulates Neural RankGAM models in a keras ranking network. An example client to showcase the usage of GAMRankingNetwork.

Keras

Add serialization and deserialization support for feature columns via tfr.keras.feature.serialize_feature_columns and tfr.keras.feature.deserialize_feature_columns. RankingNetwork’s get_config and from_config methods rely on this.

v0.3.0(Mar 24, 2020)
This is the 0.3.0 release of TensorFlow Ranking. It depends on tensorflow-serving-api==2.1.0 and is fully compatible with tensorflow==2.1.0. Both will be installed as required packages when installing tensorflow-ranking.

The main changes in this release are related to the DNN Estimator Builder and Keras APIs.

A DNN Estimator Builder is available at tfr.estimator.make_dnn_ranking_estimator().

For Keras, we provide an example that showcases the use of Keras APIs to build ranking models, and documentation with step-by-step instructions outlining the Keras user journey.

The new Keras components are:

Losses: Ranking losses in Keras object oriented loss format, along with a base class and a factory method. The APIs are:

tfr.keras.losses.get(loss_key)

Base class: tfr.keras.losses._RankingLoss

Losses under tfr.keras.losses.*

Metrics: Ranking metrics in Keras object oriented metric format, along with a base class and a default metrics getter method. The APIs are:

tfr.keras.metrics.get_default_metrics()

Base class: tfr.keras.metrics._RankingMetric

Metrics under tfr.keras.metrics.*

Feature Transformations: tfr.keras.feature.EncodeListwiseFeatures, to convert sparse ranking features to dense. The APIs are:

tfr.keras.feature.EncodeListwiseFeatures

Ranking Network: Base classes for building Ranking Networks, which define scoring logic. The APIs are:

tfr.keras.network.RankingNetwork

tfr.keras.network.UnivariateRankingNetwork

Premade Networks: We support premade architectures users can access out-of-the-box. The APIs are:

tfr.keras.canned.DNNRankingNetwork

Keras Model: Ranking models can be built using the Keras Functional Model API. The APIs are:

tfr.keras.model.create_keras_model()

Integration with Estimators and RankingPipeline: A Keras model can be converted to an Estimator to use Estimator's training utilities, and the result is compatible with RankingPipeline. The APIs for conversion are:

tfr.keras.estimator.model_to_estimator()

v0.2.3(Mar 6, 2020)
This is the 0.2.3 release of TensorFlow Ranking. It depends on tensorflow-serving-api==2.1.0 and is fully compatible with tensorflow==2.1.0. Both will be installed as required packages when installing tensorflow-ranking.

The main changes in this release are:

Added an EstimatorBuilder class to encapsulate the boilerplate code needed to construct a TF-Ranking model Estimator. Clients can access it via tfr.estimator.EstimatorBuilder.

Added a RankingPipeline class to hide the boilerplate code for reading train and eval data, defining train and eval specs, building datasets, and setting up exporting strategies. With this, clients can construct a RankingPipeline object using tfr.ext.pipeline.RankingPipeline and then call train_and_eval() to run the pipeline.

Provided an example to demo the use of tfr.ext.pipeline.RankingPipeline.

v0.2.2(Jan 17, 2020)
This is the 0.2.2 release of TensorFlow Ranking. It depends on tensorflow-serving-api==2.1.0 and is fully compatible with tensorflow==2.1.0. Both will be installed as required packages when installing tensorflow-ranking. The main changes in this release are:

Fixed metric computation to include lists without any relevant examples.

Updated demo code to be TF 2.1.0 compatible.

Replaced deprecated dataset.output_types with tf.compat.v1.data.get_output_types(dataset).
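The metric fix in the list above can be pictured as a change in which lists enter the average. The sketch below assumes that a list without relevant examples contributes a value of 0 when included and was previously skipped; that reading of the fix, and the helper name mean_metric, are assumptions rather than the library's code.

```python
def mean_metric(per_list_labels, per_list_values, include_lists_without_relevant=True):
    """Averages a per-list metric, optionally skipping lists with no relevant example."""
    kept = []
    for labels, value in zip(per_list_labels, per_list_values):
        has_relevant = any(label > 0 for label in labels)
        if has_relevant or include_lists_without_relevant:
            # A list with nothing relevant cannot score above zero.
            kept.append(value if has_relevant else 0.0)
    return sum(kept) / len(kept) if kept else 0.0

per_list_labels = [[1, 0], [0, 0], [0, 1]]
per_list_values = [1.0, 0.0, 0.5]  # e.g. a reciprocal rank for each list

included = mean_metric(per_list_labels, per_list_values)
skipped = mean_metric(per_list_labels, per_list_values,
                      include_lists_without_relevant=False)
print(included, skipped)
```

Skipping such lists inflates the reported average, so including them gives a more conservative number.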

v0.2.1(Dec 18, 2019)
This is the 0.2.1 release of TensorFlow Ranking. It depends on tensorflow-serving-api==2.0.0 and is fully compatible with tensorflow==2.0.0. Both will be installed as required packages when installing tensorflow-ranking.

The main changes in this release are:

Updated demo code to use Antique data in ELWC format.

Updated tutorial script to demonstrate using weights in metrics and losses.

Removed LIBSVM generator from tfr.data and updated the docs.

Made the gain and discount parameters in the definition of NDCG configurable.

Added MAP as a ranking metric.

Added a topn parameter to MRR metric.
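The metric changes in this list (configurable NDCG gain and discount, MAP, and a topn cutoff for MRR) can be sketched for a single list in plain Python. These are textbook definitions written for illustration; the function names, the default gain 2^label - 1, and the default discount 1 / log2(rank + 1) are assumptions and not TF-Ranking's API.

```python
import math

def rank_labels(labels, scores):
    """Returns labels ordered by descending score (stable for ties)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [labels[i] for i in order]

def dcg(ranked_labels, gain_fn, discount_fn):
    return sum(gain_fn(label) * discount_fn(rank)
               for rank, label in enumerate(ranked_labels, start=1))

def ndcg(labels, scores,
         gain_fn=lambda label: 2.0 ** label - 1.0,
         discount_fn=lambda rank: 1.0 / math.log2(rank + 1.0)):
    """NDCG with pluggable gain and discount functions."""
    ideal = dcg(sorted(labels, reverse=True), gain_fn, discount_fn)
    if ideal == 0.0:
        return 0.0
    return dcg(rank_labels(labels, scores), gain_fn, discount_fn) / ideal

def average_precision(labels, scores):
    """Mean of precision@rank over the ranks that hold a relevant item."""
    hits, total = 0, 0.0
    for rank, label in enumerate(rank_labels(labels, scores), start=1):
        if label > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def reciprocal_rank(labels, scores, topn=None):
    """1 / rank of the first relevant item within the top topn, else 0."""
    ranked = rank_labels(labels, scores)[:topn]
    for rank, label in enumerate(ranked, start=1):
        if label > 0:
            return 1.0 / rank
    return 0.0

labels = [0, 2, 1]
scores = [0.3, 0.2, 0.9]
default_ndcg = ndcg(labels, scores)
linear_ndcg = ndcg(labels, scores, gain_fn=lambda label: float(label))
print(default_ndcg, linear_ndcg, average_precision(labels, scores))
```

Swapping the gain function changes how strongly highly graded documents dominate the score, which is why making it configurable is useful when labels are not on the usual graded-relevance scale.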

v0.2.0(Oct 22, 2019)

This is the 0.2.0 release of TensorFlow Ranking. It depends on tensorflow-serving-api>=2.0.0 and is fully compatible with tensorflow==2.0.0. Both will be installed as required packages when installing tensorflow-ranking.

No new functionality has been added compared with v0.1.6. This release marks a milestone: future development will be based on TensorFlow 2.0.
v0.1.6(Oct 22, 2019)
This is the 0.1.6 release of TensorFlow Ranking. It adds a dependency on tensorflow-serving-api in order to use tensorflow.serving.ExampleListWithContext as the input data format. It is tested and stable against TensorFlow 1.15.0 and TensorFlow 2.0.0. The main changes in this release are:

Support tensorflow.serving.ExampleListWithContext as our input data format (commit). This is a more user-friendly format than the ExampleInExample one.

Add a demo script for data stored in TFRecord. The stored format can be ExampleListWithContext or another format defined in data.py.

v0.1.5(Sep 24, 2019)
This is the 0.1.5 release of TensorFlow Ranking. It is tested and stable against TensorFlow version 1.14.0 and TensorFlow version 2.0 RC0. The main changes in this release are:

Support for Multi-Task Learning and Multi-Objective Learning (Issue #85).

Deprecate the input_size argument for tfr.feature.encode_listwise_features and infer it automatically in the function.

Fix the weighted MRR computation for doc-level weights.

v0.1.4(Sep 5, 2019)
This is the 0.1.4 release of TensorFlow Ranking. It is tested and stable against TensorFlow version 1.14.0 and TensorFlow version 2.0 RC0. The main changes in this release are:

Documentation for APIs. A list of symbols/operations is available here.

Demo for using sparse and embedded features on ANTIQUE dataset.

Example for prediction using ranking estimator in demo code.

Code and test cases are fully TF2.0 RC0 compatible.

Updated tfr.utils.sort_by_scores to break ties.

Added ApproxMRR loss function.
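One common way to break ties when sorting by score is to shuffle the items first and then apply a stable sort, so that tied items do not always keep their input order. The sketch below shows that idea in plain Python; whether tfr.utils.sort_by_scores uses exactly this strategy is an assumption, and sort_items_breaking_ties is a hypothetical name.

```python
import random

def sort_items_breaking_ties(scores, items, rng=None):
    """Sorts items by descending score, ordering tied items at random."""
    rng = rng or random.Random()
    order = list(range(len(scores)))
    rng.shuffle(order)                    # randomize positions first
    order.sort(key=lambda i: -scores[i])  # stable sort keeps shuffled order among ties
    return [items[i] for i in order]

scores = [0.5, 0.9, 0.5]
items = ["a", "b", "c"]
result = sort_items_breaking_ties(scores, items, rng=random.Random(0))
print(result)
```

Without this step, a model that outputs a constant score would appear to rank perfectly whenever the input happens to be ordered by relevance.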

Announcement: A hands-on tutorial for TF-Ranking, with relevant theoretical background, will be presented on Oct 2 at ICTIR 2019, hosted in Santa Clara, CA. Please consider attending!
v0.1.3(Jun 20, 2019)
This is the 0.1.3 release of TensorFlow Ranking. It is tested and stable against TensorFlow version 1.14.0. The main changes in this release are:

Introduced an ExampleInExample data format.

Introduced a factory method to build tf.dataset in different data formats.

Introduced a factory method to build serving receiving input functions for different data formats.

Refactored the main modules to be object-oriented to increase the code extensibility.


