Paper Reviews (3)

[Paper Review] Zero-Shot Text-to-Image Generation (DALL-E)
https://arxiv.org/abs/2102.12092
"Text-to-image generation has traditionally focused on finding better modeling assumptions for training on a fixed dataset. These assumptions might involve complex architectures, auxiliary losses, or side information such as object part labels or segmentati…"
Research objective: moving beyond the small datasets of prior work and leveraging large-scale data & parameters → high-quality image generation. Backgroun… Jan 31, 2025.
[Paper Review] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
https://arxiv.org/abs/1810.04805
"We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations from unla…"
Introduction: BERT … Jan 31, 2025.
[Paper Review] InstructPix2Pix: Learning to Follow Image Editing Instructions
https://arxiv.org/abs/2211.09800
"We propose a method for editing images from human instructions: given an input image and a written instruction that tells the model what to do, our model follows these instructions to edit the image. To obtain training data for this problem, we combine the…"
Background: Diffusion Model, Conditional … Jan 31, 2025.