K-means Clustering: Algorithm, Numeric Example, Drawbacks

October 5, 2021 (updated February 17, 2022) / noushin.gauhar

Among clustering algorithms, the simplest and most popular is the K-means clustering algorithm. This article covers: What is K-means clustering?; the K-means clustering algorithm, step by step; a worked numeric example of K-means clustering; limitations of K-means clustering. What is K-means clustering?: The K-means clustering algorithm is an unsupervised learning algorithm. It is a partitioning algorithm: it divides the dataset into ‘k’ partitions. These partitions … Continue reading K-means Clustering: Algorithm, Numeric Example, Drawbacks
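As a rough illustration of the partitioning idea in the excerpt above, here is a minimal sketch of the standard K-means loop (Lloyd's algorithm) in plain Python. The function name `kmeans`, the random choice of initial centroids, the iteration cap, and the toy points are all assumptions made for this sketch; the full article may initialise and present the steps differently.

```python
import math
import random


def kmeans(points, k, n_iters=100, seed=0):
    """Partition `points` (tuples of floats) into k clusters."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct data points picked at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # no point changed cluster, so we have converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical toy data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

The cluster numbering (which group gets label 0) depends on the random initial centroids, so only the grouping itself is meaningful.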

Outliers: Introduction to outliers & different types of outliers

May 8, 2021 (updated October 28, 2022) / noushin.gauhar

Banks process hundreds of thousands of transactions every day. Among so many transactions, how can a fraudulent one be caught? Normally, every transaction follows some characteristic pattern. Fraud falls outside that pattern, and this fact is what fraud detection relies on. Here, the fraud is the outlier. This article discusses: What is an outlier?; outlier detection … Continue reading Outliers: Introduction to outliers & different types of outliers
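To make "falling outside the usual pattern" concrete, here is a small sketch that flags unusual transaction amounts with the common 1.5 × IQR rule. The excerpt does not say which detection method the article uses, so the IQR rule, the function name `iqr_outliers`, and the amounts below are assumptions chosen only for illustration.

```python
import statistics


def iqr_outliers(values, factor=1.5):
    """Return the values lying outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical transaction amounts; one is far from the rest.
amounts = [120, 95, 130, 110, 105, 98, 125, 115, 5000, 102]
outliers = iqr_outliers(amounts)
print(outliers)
```

Real fraud detection uses many features per transaction, not a single amount; this one-dimensional rule only shows the basic idea.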

Construction of Decision Tree: Gain ratio

January 11, 2021 / noushin.gauhar

Earlier we built a decision tree using the ID3 algorithm, relying on entropy and information gain. That approach has a problem: ID3 is biased toward attributes that have many unique values. In other words, it treats a multi-valued attribute as the best attribute and tries to assign it to the root node. For example, if a … Continue reading Construction of Decision Tree: Gain ratio
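The bias described above can be seen on a tiny example. The sketch below computes information gain and gain ratio (information gain divided by split information) for two attributes: a unique row ID and a coarser attribute. The data, the attribute names `row_id` and `outlook`, and the helper names are hypothetical and not taken from the article.

```python
import math
from collections import Counter, defaultdict


def entropy(items):
    """Shannon entropy (in bits) of a list of labels or attribute values."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total) for n in Counter(items).values())


def partition(values, labels):
    """Group the class labels by the attribute value of each row."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    return list(groups.values())


def information_gain(values, labels):
    groups = partition(values, labels)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups)
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    # Split information is the entropy of the attribute's own value distribution.
    split_info = entropy(values)
    return information_gain(values, labels) / split_info if split_info > 0 else 0.0


# Hypothetical toy data: 8 rows, a unique row ID and a two-valued attribute.
labels = ["yes"] * 4 + ["no"] * 4
row_id = list(range(8))
outlook = ["sunny"] * 5 + ["rainy"] * 3

for name, attr in (("row_id", row_id), ("outlook", outlook)):
    print(name, information_gain(attr, labels), gain_ratio(attr, labels))
```

On this data, information gain ranks `row_id` highest because every row lands in its own pure partition, while gain ratio penalises its large split information and ranks `outlook` higher.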

Constructing a decision tree: Entropy & Information gain

December 27, 2020 (updated January 8, 2021) / noushin.gauhar / 1 Comment

We know that when building a decision tree we assign different attributes to the decision nodes. But how do we know which attribute belongs at which node? What happens if we assign them randomly? The target variable (the one whose value we want to predict) and the feature variables (all the other attributes) are not all equally related. Some features, with respect to the target variable, … Continue reading Constructing a decision tree: Entropy & Information gain
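For reference, the two quantities named in the title are usually defined as follows. The symbols are assumptions, since the excerpt does not show the article's notation: S is the set of training rows, p_i is the fraction of rows in class i out of c classes, A is a candidate attribute, and S_v is the subset of S where A takes the value v.

$latex H(S) = -\sum_{i=1}^{c} p_i \log_2 p_i &s=1$

$latex \mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} \: H(S_v) &s=1$

The attribute with the largest gain is the one whose split leaves the least uncertainty about the target, which is why it is the natural choice for a node instead of a random assignment.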

Decision Tree: A Classification Algorithm

December 24, 2020 / noushin.gauhar

Above are a speed limit sign and a diagram. The topmost node of the diagram reads “car speed >= 50”. Now we ask, “Is the car's speed equal to or greater than 50?” If the answer is “yes”, the car has to slow down. If the answer is “no”, the car can keep going at its current speed. Here a … Continue reading Decision Tree: A Classification Algorithm
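The single question in that diagram can be written as a one-node decision tree. This is only a sketch: the function name `speed_decision` and the returned strings are made up for illustration, while the threshold of 50 comes from the excerpt.

```python
def speed_decision(speed, limit=50):
    """Root decision node: is the car's speed >= limit?"""
    if speed >= limit:
        return "slow down"           # the "yes" branch
    return "keep current speed"      # the "no" branch


for speed in (30, 50, 80):
    print(speed, speed_decision(speed))
```

A full decision tree is the same idea repeated: each internal node asks one question about an attribute, and each leaf holds the final decision.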

Sufficient Statistics: Working out different distributions (Part 3)

October 29, 2020 / noushin.gauhar

We will look at a few more examples for different distributions. To follow them, you need to be familiar with sufficient statistics and the factorization theorem. The previous part of this thread is here. 5. Suppose we are given a normal distribution whose unknown parameter is the mean μ and whose variance is σ² = 1. What is the sufficient statistic for μ? The pdf of the normal distribution, $latex f(x)= \frac{1}{\sigma \sqrt{2 \pi}} \: … Continue reading Sufficient Statistics: Working out different distributions (Part 3)
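The excerpt is cut off before the derivation. As a sketch of how the factorization usually goes for this case (not necessarily the article's exact steps), take an i.i.d. sample x_1, …, x_n with σ² = 1 and write t for the sum of the x_i:

$latex \begin{aligned} f(x_1,x_2,...,x_n|\mu) &= \prod_{i=1}^{n} \frac{1}{\sqrt{2 \pi}} \: e^{-\frac{(x_i-\mu)^2}{2}} \\ &= \underbrace{(2\pi)^{-n/2} \: e^{-\frac{1}{2}\sum_{i=1}^{n} x_i^2}}_{h(x)} \; \underbrace{e^{\mu \sum_{i=1}^{n} x_i - \frac{n\mu^2}{2}}}_{g_\mu(t)} \end{aligned} &s=1$

Only the second factor involves μ, and it depends on the data only through t = Σ x_i, so Σ x_i (equivalently the sample mean x̄) is a sufficient statistic for μ.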

Sufficient Statistics: Working out different distributions (Part 2)

October 29, 2020 / noushin.gauhar

We will look at a few more examples for different distributions. To follow them, you need to be familiar with sufficient statistics and the factorization theorem. The previous part of this thread is here. 3. We are given an exponential distribution with unknown parameter λ. What is the sufficient statistic for λ? The pdf of the exponential distribution, $latex f(x)=\lambda \: e^{-\lambda x} &s=1$ The joint pdf is $latex \begin{aligned} f(x_1,x_2,...,x_n|\lambda) &= … Continue reading Sufficient Statistics: Working out different distributions (Part 2)
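The joint pdf is truncated in the excerpt. A sketch of the standard factorization for an i.i.d. sample x_1, …, x_n with every x_i ≥ 0, which may differ in layout from the article's own working, is:

$latex \begin{aligned} f(x_1,x_2,...,x_n|\lambda) &= \prod_{i=1}^{n} \lambda \: e^{-\lambda x_i} \\ &= \underbrace{1}_{h(x)} \cdot \underbrace{\lambda^{n} \: e^{-\lambda \sum_{i=1}^{n} x_i}}_{g_\lambda(t)} \end{aligned} &s=1$

The factor containing λ depends on the data only through t = Σ x_i, so Σ x_i is a sufficient statistic for λ.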

Sufficient Statistics: Working out different distributions (Part 1)

October 29, 2020 / noushin.gauhar

We will use the factorization theorem to find sufficient statistics for various probability distributions. Keep the following in mind. You need to know the pdf/pmf of the given distribution. You need to work out the joint pdf/pmf. You need to identify h(x) and g_θ(t). Every part that involves the unknown parameter goes into g_θ(t); everything else becomes h(x). In the function g_θ(t), apart from the unknown parameter and constants, … Continue reading Sufficient Statistics: Working out different distributions (Part 1)

Neyman-Fisher Factorization Criterion/Theorem: How to find a sufficient statistic?

October 26, 2020 (updated October 31, 2020) / noushin.gauhar

We have learned the concept of a sufficient statistic. Now, what do we do if we want to find a sufficient statistic for some parameter? Going by the definition, we could work out the conditional distribution of the random sample and then check by calculation whether that distribution depends on the parameter. In practice, finding the conditional distribution is not that easy. That is why, for a parameter … Continue reading Neyman-Fisher Factorization Criterion/Theorem: How to find a sufficient statistic?
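For orientation, the criterion named in the title is usually stated as follows. The notation is an assumption borrowed from the h(x) and g_θ(t) used elsewhere on this page, and the article's own statement may be worded differently. A statistic T(x) is sufficient for θ if and only if the joint pdf/pmf can be written as

$latex f(x_1,x_2,...,x_n|\theta) = h(x_1,x_2,...,x_n) \: g_\theta\big(T(x_1,x_2,...,x_n)\big) &s=1$

where h does not depend on θ, and g_θ depends on the data only through T. This replaces the hard step of deriving a conditional distribution with the easier step of factoring the joint density.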

Sufficient Statistic: Definition, Example

October 26, 2020 / noushin.gauhar / 3 Comments

We know that a quantity describing a population is called a parameter, and one describing a sample is called a statistic. If we know a statistic that captures everything the sample can tell us about some parameter of the population, we call it a sufficient statistic. For example, suppose we want to estimate the population mean μ using the sample mean x̄. The information in the original data points about the population mean … Continue reading Sufficient Statistic: Definition, Example
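One way to see what "captures everything" means is a numeric check. The sketch below assumes a normal population with known variance 1 (an assumption, since the excerpt does not name a distribution) and compares two hypothetical samples that share the same mean. The difference between their log-likelihoods comes out the same for every μ tried, so once x̄ is known the individual data points add nothing further about μ.

```python
import math


def normal_loglik(sample, mu, sigma=1.0):
    """Log-likelihood of an i.i.d. normal sample with mean mu and std sigma."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)
        for x in sample
    )


# Two hypothetical samples with the same sample mean (3.0) but different spread.
a = [1.0, 2.0, 6.0]
b = [2.5, 3.0, 3.5]

# The difference of log-likelihoods is evaluated at several candidate means.
diffs = [normal_loglik(a, mu) - normal_loglik(b, mu) for mu in (-1.0, 0.0, 2.0, 5.0)]
print(diffs)
```

If the two samples had different means, the difference would change with μ, which is the sense in which x̄ is the part of the data that matters for estimating μ.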

Learn with Gauhar

Learn computer science with me in Bangla
