Using Fast KNN for an image captioning task
Updated Mar 15, 2024 - Jupyter Notebook
PyTorch implementation of Conditional Generative Adversarial Networks (cGAN) for image colorization of the MS COCO dataset
MS-COCO-ES is a dataset derived from the original MS-COCO dataset. The project provides a small subset of the original image captions, translated into Spanish by human annotators. The subset consists of 20,000 captions for 4,000 images.
SacreEOS experiments
Course project for COMP 6130 Data Mining, Summer'24, Auburn University
This project extends the existing Mask-RCNN code to generate a Blum Medial Axis from a natural RGB image.
Python code implementing DeMix, a DETR-assisted CutMix method for image data augmentation
[AAAI 2024] Official code for "Hyp-OW: Exploiting Hierarchical Structure Learning with Hyperbolic Distance Enhances Open World Object Detection"
Performed object detection and logged time periods by deploying YOLO-V3 with transfer learning, fine-tuning classification across all layers of the network. The model was fine-tuned from the pre-trained MS-COCO weights and adapted for a custom dataset.
Reproduction of LaVisE: Explaining Deep Convolutional Neural Networks via Latent Visual-Semantic Filter Attention
COCO-Stuff dataset for huggingface datasets
A collection of semantic segmentation approaches
Used deep learning to train a CNN + RNN/LSTM on the MS-COCO dataset to automatically generate captions.
A labeling tool that makes it easy to plug in detection networks to assist the labeling process
Multi-Auto-Annotate: Automatically annotate multiple labels across your entire image directory with a single command. Works with the COCO dataset and can also be trained on a custom dataset.
A deep learning project built with YOLO-v5 (You Only Look Once) that detects and recognizes obstacles for autonomous vehicles. The model also estimates the distance of each obstacle from the initial position considered.
Python dictionary storing object tags for MS-COCO images. Data from 3 different sources (COCO ground truths, VG classifier and Microsoft's VinVL) are available.
A system to process visual input on timed frames to produce sensible audio aid in accordance with human information processing limits, using image captioning, semantic text comparison and text-to-speech modules.
Vision-based document layout detection, segmentation and context classification using MaskRCNN on TensorFlow-Keras, PyTorch & Detectron2.
Intelligent Advertisement Generation for e-commerce websites using deep learning.