Welcome to x-jeff blog

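【Paper Reading】Enriching Variety of Layer-wise Learning Information by Gradient Combination

PRN

This is an original article and may not be reproduced without my permission. Please credit the source when reposting. 1. Introduction According to previous research, strategies for improving the performance of deep convolutional neural networks fall into two areas: how to combine features and propagate them to subsequent layers, and how to propagate gradients more efficiently to all layers. We propose a new angle: how to combine the gradients of each layer during training to achieve better learning. We therefore propose PRN (partial residual netwo...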

Posted by x-jeff on April 24, 2025

【LLM】ChatGPT Prompt Engineering for Developers

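Prompt engineering for developers

These are my personal notes on the DeepLearning.AI course "ChatGPT Prompt Engineering for Developers". Course link: https://www.deeplearning.ai/short-courses/chatgpt-prompt-engineering-for-developers/. This is an original article and may not be reproduced without my permission. When reposting...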

Posted by x-jeff on April 18, 2025

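【Paper Reading】An Energy and GPU-Computation Efficient Backbone Network for Real-Time Object Detection

OSA, MAC, VoVNet

This is an original article and may not be reproduced without my permission. Please credit the source when reposting. 1. Introduction In our experiments (see Table 4), we found that DenseNet-based detectors have fewer parameters, require less computation, and perform better than ResNet-based detectors. The main difference between ResNet and DenseNet is how they aggregate features. ResNet aggregates features from shallow layers by addition, which may cause what the shallow-layer feature maps carry...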

Posted by x-jeff on April 14, 2025

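【Machine Learning Basics】Lesson 61: [Semi-supervised Learning] Semi-supervised Clustering

Constrained k-means algorithm, constrained seed k-means algorithm

The 【Machine Learning Basics】 series is my set of reading notes on Zhou Zhihua's book "Machine Learning" (《机器学习》). This is an original article and may not be reproduced without my permission. Please credit the source when reposting. 1. Semi-supervised clustering Clustering is a typical unsupervised learning task. In real clustering tasks, however, we can often obtain some additional supervision, so semi-supervised clustering can be used to exploit that supervision and obtain better clustering results. The supervision obtained in clustering tasks...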

Posted by x-jeff on April 13, 2025

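【Machine Learning Basics】Lesson 60: [Semi-supervised Learning] Disagreement-based Methods

Disagreement-based methods

The 【Machine Learning Basics】 series is my set of reading notes on Zhou Zhihua's book "Machine Learning" (《机器学习》). This is an original article and may not be reproduced without my permission. Please credit the source when reposting. 1. Disagreement-based methods Unlike generative methods, semi-supervised SVM, graph-based semi-supervised learning, and other approaches that exploit unlabeled data with a single learner, disagreement-based methods use multiple learners, and the "disagreement" between the learners, with respect to the unlabeled data...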

Posted by x-jeff on April 12, 2025

【Build a Large Language Model from Scratch】【7】【Fine-tuning to follow instructions】

Introduction to instruction fine-tuning, Preparing a dataset for supervised instruction fine-tuning, Organizing data into training batches, Creating data loaders for an instruction dataset, Loading a pretrained LLM, Fine-tuning the LLM on instruction data, Extracting and saving responses, Evaluating the fine-tuned LLM

The 【Build a Large Language Model from Scratch】 series is my set of reading notes on the book "Build a Large Language Model (From Scratch)". Book link: Build a Large Language Model (From Scratch). Official example code: LLMs-from-scratch. This is an original article and may not be reproduced without my permission. Please credit the...

Posted by x-jeff on March 31, 2025

【Build a Large Language Model from Scratch】【6】【Fine-tuning for classification】

Different categories of fine-tuning, Preparing the dataset, Creating data loaders, Initializing a model with pretrained weights, Adding a classification head, Calculating the classification loss and accuracy, Fine-tuning the model on supervised data, Using the LLM as a spam classifier

The 【Build a Large Language Model from Scratch】 series is my set of reading notes on the book "Build a Large Language Model (From Scratch)". Book link: Build a Large Language Model (From Scratch). Official example code: LLMs-from-scratch. This is an original article and may not be reproduced without my permission. Please credit the...

Posted by x-jeff on March 21, 2025

【Build a Large Language Model from Scratch】【5】【Pretraining on unlabeled data】

Evaluating generative text models, Training an LLM, Decoding strategies to control randomness, Loading and saving model weights in PyTorch, Loading pretrained weights from OpenAI

The 【Build a Large Language Model from Scratch】 series is my set of reading notes on the book "Build a Large Language Model (From Scratch)". Book link: Build a Large Language Model (From Scratch). Official example code: LLMs-from-scratch. This is an original article and may not be reproduced without my permission. Please credit the...

Posted by x-jeff on March 13, 2025

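【Machine Learning Basics】Lesson 59: [Semi-supervised Learning] Graph-based Semi-supervised Learning

Graph-based semi-supervised learning

The 【Machine Learning Basics】 series is my set of reading notes on Zhou Zhihua's book "Machine Learning" (《机器学习》). This is an original article and may not be reproduced without my permission. Please credit the source when reposting. 1. Graph-based semi-supervised learning I did not fully understand this chapter, so this post is only a record; for detailed derivations of the formulas, see PumpkinBook (南瓜书). Given a dataset, we can map it to a graph in which each sample corresponds to a node. If two samples are highly similar (or strongly correlated), then the corresponding...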

Posted by x-jeff on March 9, 2025

【Build a Large Language Model from Scratch】【4】【Implementing a GPT model from scratch to generate text】

Coding an LLM architecture, Normalizing activations with layer normalization, Implementing a feed forward network with GELU activations, Adding shortcut connections, Connecting attention and linear layers in a transformer block, Coding the GPT model, Generating text

The 【Build a Large Language Model from Scratch】 series is my set of reading notes on the book "Build a Large Language Model (From Scratch)". Book link: Build a Large Language Model (From Scratch). Official example code: LLMs-from-scratch. This is an original article and may not be reproduced without my permission. Please credit the...

Posted by x-jeff on March 3, 2025
