

[Original] Index of previous posts

Logistic Regression, L1, L2 regularization, Gradient/Coordinate descent in detail · MLE v.s. MAP ~ L1, L2 Math Derivation in detail · XGBoost math Derivation: an accessible, detailed derivation · Introduction to Convex Optimization Basic Concept...

2020-04-14 00:44:41 1632

[Original] Decoupling Representation and Classifier for Long-Tailed Recognition: methods for long-tailed classification in the image domain

Contents: Introduction · Recent Directions · Sampling Strategies · Methods of Learning Classifiers · Classifier Re-training (cRT) · Nearest Class Mean classifier (NCM) · $\tau$-normalized classifier ($\tau$-normalized) · Experiments · Datasets · Evaluation Protocol · Results · Sampling matter...

2021-01-08 11:48:24 1068

[Original] Relation Extraction: a survey

Contents: Previous posts index · Information Extraction v.s. Relation Extraction · Existing Works of RE · Pattern-based Methods · Statistical Relation Extraction Models · Neural Relation Extraction Methods · Future Directions · Utilizing More Data · Methods to Denoise DS Data · Open Problem for Utili...

2021-01-03 12:32:42 1723

[Original] Connecting the Dots: Document-level Neural Relation Extraction with Edge-oriented Graphs (relation extraction paper summary)

Contents: Previous posts index · Relation Extraction (RE) · document-level RE · Intuition · Contribution · Overview of Proposed Model · Proposed Model · Sentence Encoding Layer · Graph construction Layer · Node Construction · Edge Construction · Inference Layer · First Step · Second Step · Classification Layer · Resul...

2020-12-31 08:47:50 840

[Original] Cross-Lingual Learning paper summary

Contents: Previous posts index · Cross-lingual learning · Cross-lingual resources · Multilingual distributional representations · Evaluation of multilingual distributional representations · Parallel corpus · Word Alignments · Machine Translation · Universal features (out of fashion) · Biling...

2020-12-28 14:14:15 1535

[Original] BERT and RoBERTa: key points

Contents: Previous posts index · BERT Recap · Overview · BERT Specifics · There are two steps to the BERT framework: pre-training and fine-tuning · Input Output Representations · Tasks · results · Ablation studies · Effect of Pre-training Tasks · Effect of Model Sizes · Replication study of BERT p...

2020-09-18 12:11:09 1659 1

[Original] What Does BERT Look At? An Analysis of BERT’s Attention (paper summary)

Contents: Previous posts index · Before we start · Surface-Level Patterns in Attention · Probing Individual Attention Heads · Probing Attention Head Combinations · Clustering Attention Heads
Before we start: In this post, I mainly focus on the conclusions the authors reach...

2020-09-14 09:38:27 2054

[Original] The More You Know: Using Knowledge Graphs for Image Classification (paper summary)

Contents: Previous posts index · Overview · Intuition · Previous Work · Major Contribution · Graph Search Neural Network (GSNN) · GSNN Explanation · Three networks · Diagram visualization · Advantage · Incorporate the graph network into an image pipeline · Dataset · Conclusion
Overview: This p...

2020-09-02 08:54:10 2519 2

[Original] Graph Convolutional Neural Network - Spectral Convolution explained in detail

Fourier Transform: Virtually everything in the world can be described via a waveform - a function of time, space or some other variable. For instance, sound waves, the price of a stock, etc. The Fourier Transform gives us a unique and powerful way of viewin...

2020-08-24 08:49:54 2300 4

[Original] Graph Convolutional Neural Network - Spatial Convolution explained in detail

Convolutional graph neural networks (ConvGNNs): Convolutional graph neural networks (ConvGNNs) generalize the operation of convolution from grid data to graph data. The main idea is to generate a node $v$’s representation by aggregating its own features $x_v$...

2020-08-20 08:40:58 5069 1

[Original] Introduction to Graph Neural Network (GNN): a detailed beginner's guide

Contents: Previous posts index · Note · Background and Intuition · Intro to Graph Neural Networks · GNNs Framework · Definition · Recurrent graph neural networks (RecGNNs) · Introduction to RecGNNs · Banach's Fixed Point Theorem · RecGNNs v.s. RNNs · Limitation of RecGNNs · Gated Graph Neural Networks (GGN...

2020-08-17 08:37:40 3377

[Original] Kaggle: Jigsaw Multilingual Toxic Comment Classification Top Solutions (summary of gold-medal approaches)

Before we start: Two of my previous posts might be helpful in getting a general understanding of the top solutions of this competition. Please feel free to check them out: Knowledge Distillation clearly explained · Common Multilingual Language Modeling method...

2020-08-11 08:45:49 1936

[Original] Common multilingual models explained in detail (M-Bert, LASER, MultiFiT, XLM)

Contents: Previous posts index · Ways of tokenization · Word-based tokenization · Character-based tokenization · Subword tokenization · Existing approaches for cross-lingual NLP · Out-of-vocabulary (OOV) problem in mono/multi-lingual settings · M-BERT (Multi-lingual BERT) · WHY MULTILINGUAL BERT W...

2020-08-08 07:45:48 6686

[Original] Knowledge Distillation explained in detail

Contents: Previous posts index · Shortcoming of normal neural networks · Generalization of Information · Knowledge Distillation · A few Definitions · General idea of knowledge distillation · Teacher and Student · Temperature & Entropy · Training the Distil Model
Currently, esp...

2020-08-05 06:59:54 2337

[Original] Kaggle: Tweet Sentiment Extraction method summary, Part 2/2: summary of gold-medal approaches

Before we start: I attended two NLP competitions in June, Tweet Sentiment Extraction and Jigsaw Multilingual Toxic Comment Classification, and I’m happy to be a Kaggle Expert from now on. Tweet Sentiment Extraction. Goal: The objective in this competitio...

2020-07-01 12:07:22 1121 9

[Original] Kaggle: Tweet Sentiment Extraction method summary, Part 1/2: summary of common methods

Contents: Previous posts index · Note · Before we start · Tweet Sentiment Extraction · What is the MAGIC? · Common Methods · Label Smoothing · Implementation of Label Smoothing · In tensorflow · In pytorch · Multi-sample dropout · Implementation · Stochastic Weight Averaging (SWA) · Different learning rate setti...

2020-07-01 12:06:29 1932 2

[Original] RNN, LSTM explained in detail with illustrations

Contents: Previous posts index · Sequence Data · Why not use a standard neural network for sequence tasks · RNN · Different types of RNNs · Loss function of RNN · Backpropagation through time · Vanishing gradients with RNNs · Advantages and Drawbacks of RNN · LSTM · Types of gates · formulas and illustrati...

2020-06-04 11:44:02 1227

[Original] Intro to Deep Learning & Backpropagation: deep learning models and a detailed derivation of the backpropagation algorithm

Contents: Deep Neural Network · Previous posts index · Forward Propagation · Loss functions of neural network · Back-propagation · compute $\frac{\partial \ell}{\partial f(x)}$ · compute $\frac{\partial \ell}{\partial a^{(L+1)}(x)}$ · compute $\frac{\partial \ell}{\partial h^{(k)}(x)}$...

2020-05-26 04:29:34 564

[Original] Log-Linear Model & CRF (conditional random fields) explained in detail

Contents: Previous posts index · Log-Linear model · Conditional Random Fields (CRF) · Formal definition of CRF · Log-linear model to linear-CRF · Inference problem for CRF · Learning problem for CRF · Learning problem for general Log-Linear model · Learning problem for CRF · Compute $Z(\bar x, w)$...

2020-05-19 13:15:11 778

[Original] GMM & K-means: Gaussian mixture models and K-means clustering explained in detail

Contents: Previous posts index · Gaussian mixture model (GMM) · Interpretation from geometry · Interpretation from mixture model · GMM Derivation · set up · Solve by MLE · Solve by EM Algorithm · K-means
Gaussian mixture model (GMM): A Gaussian mixture model is a probabilistic mode...

2020-05-16 08:18:16 1163

[Original] Probabilistic Graphical Model (PGM): the framework explained in detail

Probabilistic Graphical Model (PGM). Definition: A probabilistic graphical model is a probabilistic model for which a graph expresses the conditional dependence structure between random variables. In general, PGM obeys the following rules: Sum Rul...

2020-05-11 02:50:37 2677 2

[Original] Hidden Markov Model (HMM): detailed derivation and analysis of the reasoning

Before reading this post, you should be familiar with the EM Algorithm and have a decent amount of knowledge of convex optimization. If not, check out my previous posts: EM Algorithm · convex optimiz...

2020-05-03 03:32:13 1610 1

[Original] EM (Expectation–Maximization) Algorithm: analysis of the reasoning and derivation

Jensen’s inequality. Theorem: Let $f$ be a convex function, and let $X$ be a random variable. Then: $E[f(X)] \geq f(E[X])$. Moreover, if $f$ is strictly con...

2020-04-24 05:38:18 947 3
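
A small numeric check of the Jensen's inequality statement quoted in the entry above may help. This is a minimal sketch and is not taken from the post: the convex function f(x) = x^2, the three-point uniform distribution, and the helper names `expectation` and `jensen_gap` are all assumptions chosen for illustration.

```python
from fractions import Fraction

def expectation(values, probs):
    """Expected value of a discrete random variable."""
    return sum(v * p for v, p in zip(values, probs))

def jensen_gap(f, values, probs):
    """Return E[f(X)] - f(E[X]); non-negative whenever f is convex."""
    lhs = expectation([f(v) for v in values], probs)
    rhs = f(expectation(values, probs))
    return lhs - rhs

# Assumed example: X is uniform on {1, 2, 6} and f(x) = x^2 (convex).
values = [Fraction(1), Fraction(2), Fraction(6)]
probs = [Fraction(1, 3)] * 3
gap = jensen_gap(lambda x: x * x, values, probs)
print(gap)  # E[X^2] = 41/3 and (E[X])^2 = 9, so the gap is 14/3
```

Exact fractions are used so the comparison is not blurred by floating-point rounding.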

[Original] In depth: Skip-gram detailed derivation and analysis

Comparison between CBOW and Skip-gram: The major difference is that skip-gram is better for infrequent words than CBOW in word2vec. For simplicity, suppose there is a sentence “$w_1 w_2 w_3 w_4$...

2020-04-17 11:57:36 2085

[Original] Distributed representation, Hyperbolic Space, Gaussian/Graph Embedding: a detailed introduction

Overview of various word representation and Embedding methods. Local Representation v.s. Distributed Representation: One-hot encoding is local representation and is good for local generalizati...

2020-04-17 11:43:54 1075

[Original] NLP basics overview + Spell Correction with Noisy Channel

NLP = NLU + NLG. NLU: Natural Language Understanding. NLG: Natural Language Generation. NLG may be viewed as the opposite of NLU: whereas in NLU, the system needs to disambiguate the input sentence to ...

2020-04-10 12:32:07 1484

[Original] Kaggle: Google Quest Q&A Labeling (silver medal in my first competition: method summary and takeaways)

fesfeee

2020-04-04 06:18:57 1255

[Original] SVM / Dual SVM math derivation, non-linear SVM, kernel function in detail

Linear SVM. Idea: We want to find a hyper-plane $w^\top x + b = 0$ that maximizes the margin. Set up: We first show that the vector $w$ is orthogonal to this hyper-plane. Let $x_1$, $x_2$...

2020-03-28 00:51:02 967

[Original] Convex Optimization: Primal Problem to Dual problem clearly explained (detailed)

Consider an optimization problem in the standard form (we call this a primal problem): We denote the optimal value of this as $p^\star$. We don’t assume the problem is convex. The Lagrange dual fun...

2020-03-27 23:57:28 1276

[Original] Introduction to Convex Optimization Basic Concepts (detailed)

Optimization problem · All optimization problems can be written as: · Optimization Categories · convex v.s. non-convex · Deep Neural Network is non-convex · continuous v.s. discrete · Most are continuous vari...

2020-03-27 23:15:27 505

[Original] XGBoost math Derivation: an accessible, detailed derivation

Bagging v.s. Boosting: Bagging: Leverages unstable base learners that are weak because of overfitting. Boosting: Leverages stable base learners that are weak because of underfitting. XGBoost. Learning ...

2020-03-27 14:19:53 610

[Original] MLE v.s. MAP compared, and the Math Derivation from MAP to the L1, L2 norm (detailed)

MLE v.s. MAP. MLE: learn parameters from data. MAP: add a prior (experience) into the model; more reliable if data is limited. As we have more and more data, the prior becomes less useful. As data inc...

2020-03-27 13:14:04 426
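
The claim in the entry above that the prior becomes less useful as data grows can be illustrated numerically. This is a minimal sketch under assumptions that do not come from the post: a Bernoulli (coin-flip) model, a Beta(5, 5) prior, an observed heads fraction of 0.8, and the hypothetical function names `mle` and `map_estimate`.

```python
def mle(heads, n):
    """Maximum likelihood estimate of the heads probability."""
    return heads / n

def map_estimate(heads, n, a=5.0, b=5.0):
    """MAP estimate: mode of the Beta(heads + a, n - heads + b) posterior.

    The Beta(a, b) prior is an assumption made for this illustration.
    """
    return (heads + a - 1) / (n + a + b - 2)

sample_sizes = [10, 100, 10000]
gaps = []
for n in sample_sizes:
    heads = 8 * n // 10  # assume 80% of the flips come up heads
    gap = abs(mle(heads, n) - map_estimate(heads, n))
    gaps.append(gap)
    print(f"n={n:>5}  MLE={mle(heads, n):.4f}  "
          f"MAP={map_estimate(heads, n):.4f}  gap={gap:.4f}")
```

With only 10 flips the prior pulls the MAP estimate noticeably toward 0.5; with 10000 flips the two estimates nearly coincide.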

[Original] Logistic Regression, L1, L2 regularization, Gradient/Coordinate descent in detail

Generative model v.s. Discriminative model: Examples: Generative model: Naive Bayes, HMM, VAE, GAN. Discriminative model: Logistic Regression, CRF. Objective function: Generative model: max p(x,y...

2020-03-27 12:39:18 896

node2vec.pdf

Prediction tasks over nodes and edges in networks require careful effort in engineering features used by learning algorithms. Recent research in the broader field of representation learning has led to significant progress in automating prediction by learning the features themselves. However, present feature learning approaches are not expressive enough to capture the diversity of connectivity patterns observed in networks.

2020-05-05
