ReDS Lab

21 repositories

preference-learning-with-rationales
Public
This is the public repository for Data-Centric Human Preference Optimization with Rationales.
Python
•
Apache License 2.0
•0•0•0•0•Updated Jul 22, 2024
SCOPE
Public
HTML
•
MIT License
•0•0•0•0•Updated Jul 20, 2024
WokeyTalky
Public
HTML
•0•2•0•0•Updated Jul 12, 2024
BEEAR
Public
This is the official GitHub repo for our paper: "BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models".
HTML
•1•6•0•0•Updated Jul 3, 2024
Forward-INF
Public
Jupyter Notebook
•
Apache License 2.0
•0•2•0•0•Updated Jun 20, 2024
Woke-Pipeline
Public
Python
•
MIT License
•0•0•0•0•Updated Jun 14, 2024
LAVA
Public
This is an official repository for "LAVA: Data Valuation without Pre-Specified Learning Algorithms" (ICLR 2023).
efficient ot optimal-transport model-agnostic data-valuation
Python
•
MIT License
•7•40•2•1•Updated Jun 5, 2024
Nash-Meta-Learning
Public
Official implementation of "Fairness-Aware Meta-Learning via Nash Bargaining." We explore hypergradient conflicts in one-stage meta-learning and their impact on fairness. Our two-stage approach uses Nash bargaining to mitigate conflicts, enhancing fairness and model performance simultaneously.
Jupyter Notebook
•0•2•0•0•Updated May 15, 2024
dataselection
Public
Projektor Website
JavaScript
•
MIT License
•0•0•0•0•Updated Dec 14, 2023
projektor
Public
This is an official repository for "Performance Scaling via Optimal Transport: Enabling Data Selection from Partially Revealed Sources" (NeurIPS 2023).
projection performance-prediction data-selection scaling-law
Python
•
MIT License
•1•10•0•0•Updated Oct 26, 2023
privmon
Public
This is an official repository for PrivMon: A Stream-Based System for Real-Time Privacy Attack Detection for Machine Learning Models (RAID 2023)
Python
•
MIT License
•0•4•0•0•Updated Oct 16, 2023
CLIP-MIA
Public
This is an official repository for Practical Membership Inference Attacks Against Large-Scale Multi-Modal Models: A Pilot Study (ICCV 2023).
Jupyter Notebook
•
MIT License
•2•20•2•0•Updated Sep 29, 2023
2d-shapley
Public
This is an official repository for "2D-Shapley: A Framework for Fragmented Data Valuation" (ICML 2023).
shapley data-valuation 2d-shapley
Jupyter Notebook
•
MIT License
•1•3•1•0•Updated Jul 27, 2023
Trojan_Removal_Benchmark
Public
Python
•0•2•0•0•Updated Jul 3, 2023
ASSET
Public
This repository is the official implementation of the paper "ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep Learning Paradigms." ASSET achieves state-of-the-art reliability in detecting poisoned samples in end-to-end supervised learning, self-supervised learning, and transfer learning.
ai backdoor transfer-learning self-supervised-learning backdoor-attacks backdoor-defense aisecurity backdoor-detection
Python
•
MIT License
•0•17•2•0•Updated Jun 7, 2023
Narcissus
Public
The official implementation of the CCS'23 paper on the Narcissus clean-label backdoor attack, which takes only THREE images to poison a face recognition dataset in a clean-label way and achieves a 99.89% attack success rate.
adversarial-machine-learning adversarial-attacks ai-security backdoor-attacks deep-poisoning-attacks
Python
•
MIT License
•11•100•6•0•Updated May 9, 2023
Meta-Sift
Public
The official implementation of the USENIX Security'23 paper "Meta-Sift": ten minutes or less to find a clean subset of size 1000 or larger in a poisoned dataset.
ai-security backdoor-attacks data-poisoning dataset-security
Python
•4•17•0•0•Updated Apr 27, 2023
Universal_Pert_Cert
Public
This repo is the official implementation of the ICLR'23 paper "Towards Robustness Certification Against Universal Perturbations." We calculate the certified robustness of a trained model against universal perturbations (UAP/backdoor).
Python
•
MIT License
•2•12•0•0•Updated Feb 14, 2023
I-BAU
Public
Official implementation of the ICLR 2022 paper, "Adversarial Unlearning of Backdoors via Implicit Hypergradient."
Jupyter Notebook
•
MIT License
•13•2•0•0•Updated Apr 24, 2022
frequency-backdoor
Public
The official implementation of the ICCV 2021 paper, "Rethinking the backdoor attacks' triggers: A frequency perspective."
Jupyter Notebook
•
MIT License
•6•1•0•0•Updated Nov 30, 2021
Knowledge-Enriched-DMI
Public
The official implementation of the ICCV 2021 paper, "Knowledge-Enriched Distributional Model Inversion Attacks."
Python
•
MIT License
•11•3•0•0•Updated Nov 6, 2021