pandas groupby unique values in column

There is a way to get basic statistical summary split by each group with a single function describe(). I write about Data Science, Python, SQL & interviews. Thats because .groupby() does this by default through its parameter sort, which is True unless you tell it otherwise: Next, youll dive into the object that .groupby() actually produces. Get statistics for each group (such as count, mean, etc) using pandas GroupBy? with row/column will be dropped. Get a short & sweet Python Trick delivered to your inbox every couple of days. Each row of the dataset contains the title, URL, publishing outlets name, and domain, as well as the publication timestamp. Although it looks easy and fancy to write one-liner like above, you should always keep in mind the PEP-8 guidelines about number of characters in one line. Get a list from Pandas DataFrame column headers. Finally, you learned how to use the Pandas .groupby() method to count the number of unique values in each Pandas group. The following examples show how to use this function in different scenarios with the following pandas DataFrame: Suppose we use the pandas unique() function to display all of the unique values in the points column of the DataFrame: Notice that the unique() function includes nan in the results by default. An Categorical will return categories in the order of document.getElementById( "ak_js_1" ).setAttribute( "value", ( new Date() ).getTime() ); Statology is a site that makes learning statistics easy by explaining topics in simple and straightforward ways. Statology Study is the ultimate online statistics study guide that helps you study and practice all of the core concepts taught in any elementary statistics course and makes your life so much easier as a student. pandas groupby multiple columns . This returns a Boolean Series thats True when an article title registers a match on the search. Designed by Colorlib. result from apply is a like-indexed Series or DataFrame. Native Python list: df.groupby(bins.tolist()) pandas Categorical array: df.groupby(bins.values) As you can see, .groupby() is smart and can handle a lot of different input types. This is a good time to introduce one prominent difference between the pandas GroupBy operation and the SQL query above. In short, when you mention mean (with quotes), .aggregate() searches for a function mean belonging to pd.Series i.e. Get the free course delivered to your inbox, every day for 30 days! Using Python 3.8. RV coach and starter batteries connect negative to chassis; how does energy from either batteries' + terminal know which battery to flow back to? You can also specify any of the following: Heres an example of grouping jointly on two columns, which finds the count of Congressional members broken out by state and then by gender: The analogous SQL query would look like this: As youll see next, .groupby() and the comparable SQL statements are close cousins, but theyre often not functionally identical. To learn more about the Pandas groupby method, check out the official documentation here. © 2023 pandas via NumFOCUS, Inc. The return can be: Connect and share knowledge within a single location that is structured and easy to search. What if you wanted to group not just by day of the week, but by hour of the day? Here is how you can use it. Why does pressing enter increase the file size by 2 bytes in windows. The unique values returned as a NumPy array. cut (df[' my_column '], [0, 25, 50, 75, 100])). In this case, youll pass pandas Int64Index objects: Heres one more similar case that uses .cut() to bin the temperature values into discrete intervals: Whether its a Series, NumPy array, or list doesnt matter. The observations run from March 2004 through April 2005: So far, youve grouped on columns by specifying their names as str, such as df.groupby("state"). However, suppose we instead use our custom function unique_no_nan() to display the unique values in the points column: Our function returns each unique value in the points column, not including NaN. If you call dir() on a pandas GroupBy object, then youll see enough methods there to make your head spin! Once you get the size of each group, you might want to take a look at first, last or record at any random position in the data. In simple words, you want to see how many non-null values present in each column of each group, use .count(), otherwise, go for .size() . It simply counts the number of rows in each group. rev2023.3.1.43268. Notice that a tuple is interpreted as a (single) key. You get all the required statistics about Quantity in each group. This was about getting only the single group at a time by specifying group name in the .get_group() method. "groupby-data/legislators-historical.csv", last_name first_name birthday gender type state party, 11970 Garrett Thomas 1972-03-27 M rep VA Republican, 11971 Handel Karen 1962-04-18 F rep GA Republican, 11972 Jones Brenda 1959-10-24 F rep MI Democrat, 11973 Marino Tom 1952-08-15 M rep PA Republican, 11974 Jones Walter 1943-02-10 M rep NC Republican, Name: last_name, Length: 116, dtype: int64, , last_name first_name birthday gender type state party, 6619 Waskey Frank 1875-04-20 M rep AK Democrat, 6647 Cale Thomas 1848-09-17 M rep AK Independent, 912 Crowell John 1780-09-18 M rep AL Republican, 991 Walker John 1783-08-12 M sen AL Republican. If True, and if group keys contain NA values, NA values together Unsubscribe any time. This most commonly means using .filter() to drop entire groups based on some comparative statistic about that group and its sub-table. You can group data by multiple columns by passing in a list of columns. axis {0 or 'index', 1 or 'columns'}, default 0 Applying a aggregate function on columns in each group is one of the widely used practice to get summary structure for further statistical analysis. The total number of distinct observations over the index axis is discovered if we set the value of the axis to 0. You can analyze the aggregated data to gain insights about particular resources or resource groups. . For instance, df.groupby().rolling() produces a RollingGroupby object, which you can then call aggregation, filter, or transformation methods on. therefore does NOT sort. Youve grouped df by the day of the week with df.groupby(day_names)["co"].mean(). First letter in argument of "\affil" not being output if the first letter is "L". Count unique values using pandas groupby. Pandas .groupby() is quite flexible and handy in all those scenarios. Heres one way to accomplish that: This whole operation can, alternatively, be expressed through resampling. To subscribe to this RSS feed, copy and paste this URL into your RSS reader. For aggregated output, return object with group labels as the One way to clear the fog is to compartmentalize the different methods into what they do and how they behave. One of the uses of resampling is as a time-based groupby. Your email address will not be published. How do I select rows from a DataFrame based on column values? Lets import the dataset into pandas DataFrame df, It is a simple 9999 x 12 Dataset which I created using Faker in Python , Before going further, lets quickly understand . Return Series with duplicate values removed. The same routine gets applied for Reuters, NASDAQ, Businessweek, and the rest of the lot. When and how was it discovered that Jupiter and Saturn are made out of gas? Statology Study is the ultimate online statistics study guide that helps you study and practice all of the core concepts taught in any elementary statistics course and makes your life so much easier as a student. Slicing with .groupby() is 4X faster than with logical comparison!! When you iterate over a pandas GroupBy object, youll get pairs that you can unpack into two variables: Now, think back to your original, full operation: The apply stage, when applied to your single, subsetted DataFrame, would look like this: You can see that the result, 16, matches the value for AK in the combined result. sum () This particular example will group the rows of the DataFrame by the following range of values in the column called my_column: (0, 25] Create new column based on values from other columns / apply a function of multiple columns, row-wise in Pandas, Applications of super-mathematics to non-super mathematics. Steps Create a two-dimensional, size-mutable, potentially heterogeneous tabular data, df. a 2. b 1. the unique values is returned. used to group large amounts of data and compute operations on these Get started with our course today. So the aggregate functions would be min, max, sum and mean & you can apply them like this. To count mentions by outlet, you can call .groupby() on the outlet, and then quite literally .apply() a function on each group using a Python lambda function: Lets break this down since there are several method calls made in succession. Rather than referencing to index, it simply gives out the first or last row appearing in all the groups. Pandas: How to Get Unique Values from Index Column Note: In df.groupby(["state", "gender"])["last_name"].count(), you could also use .size() instead of .count(), since you know that there are no NaN last names. the values are used as-is to determine the groups. This includes. Before you read on, ensure that your directory tree looks like this: With pandas installed, your virtual environment activated, and the datasets downloaded, youre ready to jump in! For Series this parameter You can pass a lot more than just a single column name to .groupby() as the first argument. Python: Remove Newline Character from String, Inline If in Python: The Ternary Operator in Python. Lets continue with the same example. , Although .first() and .nth(0) can be used to get the first row, there is difference in handling NaN or missing values. In each group, subtract the value of c2 for y (in c1) from the values of c2. When calling apply and the by argument produces a like-indexed Although the article is short, you are free to navigate to your favorite part with this index and download entire notebook with examples in the end! pandas unique; List Unique Values In A pandas Column; This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. are patent descriptions/images in public domain? By clicking Post Your Answer, you agree to our terms of service, privacy policy and cookie policy. But you can get exactly same results with the method .get_group() as below, A step further, when you compare the performance between these two methods and run them 1000 times each, certainly .get_group() is time-efficient. Suppose we have the following pandas DataFrame that contains information about the size of different retail stores and their total sales: We can use the following syntax to group the DataFrame based on specific ranges of the store_size column and then calculate the sum of every other column in the DataFrame using the ranges as groups: If youd like, you can also calculate just the sum of sales for each range of store_size: You can also use the NumPy arange() function to cut a variable into ranges without manually specifying each cut point: Notice that these results match the previous example. Toss the other data into the buckets 4. Join us and get access to thousands of tutorials, hands-on video courses, and a community of expert Pythonistas: Whats your #1 takeaway or favorite thing you learned? And then apply aggregate functions on remaining numerical columns. Any of these would produce the same result because all of them function as a sequence of labels on which to perform the grouping and splitting. With both aggregation and filter methods, the resulting DataFrame will commonly be smaller in size than the input DataFrame. In the output, you will find that the elements present in col_1 counted the unique element present in that column, i.e, a is present 2 times. Count total values including null values, use the size attribute: We can drop all lines with start=='P1', then groupby id and count unique finish: I believe you want count of each pair location, Species. However there is significant difference in the way they are calculated. Also note that the SQL queries above explicitly use ORDER BY, whereas .groupby() does not. Your email address will not be published. It also makes sense to include under this definition a number of methods that exclude particular rows from each group. Use df.groupby ('rank') ['id'].count () to find the count of unique values per groups and store it in a variable " count ". Your home for data science. This is not true of a transformation, which transforms individual values themselves but retains the shape of the original DataFrame. Before you get any further into the details, take a step back to look at .groupby() itself: What is DataFrameGroupBy? Why is the article "the" used in "He invented THE slide rule"? This only applies if any of the groupers are Categoricals. The pandas .groupby() and its GroupBy object is even more flexible. With groupby, you can split a data set into groups based on single column or multiple columns. Contents of only one group are visible in the picture, but in the Jupyter-Notebook you can see same pattern for all the groups listed one below another. For one columns I can do: g = df.groupby ('c') ['l1'].unique () that correctly returns: c 1 [a, b] 2 [c, b] Name: l1, dtype: object but using: g = df.groupby ('c') ['l1','l2'].unique () returns: In this way, you can apply multiple functions on multiple columns as you need. If you want a frame then add, got it, thanks. It doesnt really do any operations to produce a useful result until you tell it to. A-143, 9th Floor, Sovereign Corporate Tower, We use cookies to ensure you have the best browsing experience on our website. However, it is never easy to analyze the data as it is to get valuable insights from it. Get a list of values from a pandas dataframe, Converting a Pandas GroupBy output from Series to DataFrame, Selecting multiple columns in a Pandas dataframe, Apply multiple functions to multiple groupby columns, How to iterate over rows in a DataFrame in Pandas. is there a way you can have the output as distinct columns instead of one cell having a list? As per pandas, the function passed to .aggregate() must be the function which works when passed a DataFrame or passed to DataFrame.apply(). With that in mind, you can first construct a Series of Booleans that indicate whether or not the title contains "Fed": Now, .groupby() is also a method of Series, so you can group one Series on another: The two Series dont need to be columns of the same DataFrame object. Find centralized, trusted content and collaborate around the technologies you use most. All that is to say that whenever you find yourself thinking about using .apply(), ask yourself if theres a way to express the operation in a vectorized way. Now, run the script to see how both versions perform: When run three times, the test_apply() function takes 2.54 seconds, while test_vectorization() takes just 0.33 seconds. This tutorial is meant to complement the official pandas documentation and the pandas Cookbook, where youll see self-contained, bite-sized examples. To learn more, see our tips on writing great answers. Here, we can count the unique values in Pandas groupby object using different methods. One term thats frequently used alongside .groupby() is split-apply-combine. The next method quickly gives you that info. In that case, you can take advantage of the fact that .groupby() accepts not just one or more column names, but also many array-like structures: Also note that .groupby() is a valid instance method for a Series, not just a DataFrame, so you can essentially invert the splitting logic. I want to do the following using pandas's groupby over c0: Group rows based on c0 (indicate year). That result should have 7 * 24 = 168 observations. Using Python 3.8 Inputs The result may be a tiny bit different than the more verbose .groupby() equivalent, but youll often find that .resample() gives you exactly what youre looking for. So, as many unique values are there in column, those many groups the data will be divided into. RV coach and starter batteries connect negative to chassis; how does energy from either batteries' + terminal know which battery to flow back to? The result set of the SQL query contains three columns: In the pandas version, the grouped-on columns are pushed into the MultiIndex of the resulting Series by default: To more closely emulate the SQL result and push the grouped-on columns back into columns in the result, you can use as_index=False: This produces a DataFrame with three columns and a RangeIndex, rather than a Series with a MultiIndex. Exactly, in the similar way, you can have a look at the last row in each group. Hash table-based unique, groupby (pd. Your email address will not be published. category is the news category and contains the following options: Now that youve gotten a glimpse of the data, you can begin to ask more complex questions about it. Moving ahead, you can apply multiple aggregate functions on the same column using the GroupBy method .aggregate(). How do create lists of items for every unique ID in a Pandas DataFrame? how would you combine 'unique' and let's say '.join' in the same agg? You can use the following syntax to use the, This particular example will group the rows of the DataFrame by the following range of values in the column called, We can use the following syntax to group the DataFrame based on specific ranges of the, #group by ranges of store_size and calculate sum of all columns, For rows with a store_size value between 0 and 25, the sum of store_size is, For rows with a store_size value between 25 and 50, the sum of store_size is, If youd like, you can also calculate just the sum of, #group by ranges of store_size and calculate sum of sales. By clicking Accept all cookies, you agree Stack Exchange can store cookies on your device and disclose information in accordance with our Cookie Policy. aligned; see .align() method). Almost there! The Quick Answer: Use .nunique() to Count Unique Values in a Pandas GroupBy Object. As you can see it contains result of individual functions such as count, mean, std, min, max and median. By default group keys are not included Curated by the Real Python team. pandas.core.groupby.SeriesGroupBy.nsmallest, pandas.core.groupby.SeriesGroupBy.is_monotonic_increasing. And also, to assign groupby output back to the original dataframe, we usually use transform: Typeerror: Str Does Not Support Buffer Interface, Why Isn't Python Very Good for Functional Programming, How to Install Python 3.X and 2.X on the Same Windows Computer, Find First Sequence Item That Matches a Criterion, How to Change the Figure Size with Subplots, Python Dictionary:Typeerror: Unhashable Type: 'List', What's the Difference Between _Builtin_ and _Builtins_, Inheritance of Private and Protected Methods in Python, Can You Use a String to Instantiate a Class, How to Run a Function Periodically in Python, Deleting List Elements Based on Condition, Global Variable from a Different File Python, Importing Modules: _Main_ VS Import as Module, Find P-Value (Significance) in Scikit-Learn Linearregression, Type Hint for a Function That Returns Only a Specific Set of Values, Downloading with Chrome Headless and Selenium, Convert Floating Point Number to a Certain Precision, and Then Copy to String, What Do I Do When I Need a Self Referential Dictionary, Can Elementtree Be Told to Preserve the Order of Attributes, How to Filter a Django Query with a List of Values, How to Set the Figure Title and Axes Labels Font Size in Matplotlib, How to Prevent Python's Urllib(2) from Following a Redirect, Python: Platform Independent Way to Modify Path Environment Variable, Make a Post Request While Redirecting in Flask, Valueerror: Numpy.Dtype Has the Wrong Size, Try Recompiling, How to Make Python Scripts Executable on Windows, About Us | Contact Us | Privacy Policy | Free Tutorials. You need to specify a required column and apply .describe() on it, as shown below . Please note that, the code is split into 3 lines just for your understanding, in any case the same output can be achieved in just one line of code as below. not. pd.Series.mean(). The total number of distinct observations over the index axis is discovered if we set the value of the axis to 0. Whereas, if you mention mean (without quotes), .aggregate() will search for function named mean in default Python, which is unavailable and will throw an NameError exception. is unused and defaults to 0. All Rights Reserved. It basically shows you first and last five rows in each group just like .head() and .tail() methods of pandas DataFrame. Theres much more to .groupby() than you can cover in one tutorial. Suspicious referee report, are "suggested citations" from a paper mill? Launching the CI/CD and R Collectives and community editing features for How to combine dataframe rows, and combine their string column into list? as in example? @AlexS1 Yes, that is correct. Lets see how we can do this with Python and Pandas: In this post, you learned how to count the number of unique values in a Pandas group. pandas.unique# pandas. Similar to what you did before, you can use the categorical dtype to efficiently encode columns that have a relatively small number of unique values relative to the column length. Once you get the number of groups, you are still unware about the size of each group. Related Tutorial Categories: Index(['Wednesday', 'Wednesday', 'Wednesday', 'Wednesday', 'Wednesday'. Therefore, you must have strong understanding of difference between these two functions before using them. Note: Im using a self created Dummy Sales Data which you can get on my Github repo for Free under MIT License!! Earlier you saw that the first parameter to .groupby() can accept several different arguments: You can take advantage of the last option in order to group by the day of the week. Split along rows (0) or columns (1). No doubt, there are other ways. From the pandas GroupBy object by_state, you can grab the initial U.S. state and DataFrame with next(). Hosted by OVHcloud. In SQL, you could find this answer with a SELECT statement: You call .groupby() and pass the name of the column that you want to group on, which is "state". . Leave a comment below and let us know. This argument has no effect if the result produced Whats important is that bins still serves as a sequence of labels, comprising cool, warm, and hot. The pandas GroupBy method get_group() is used to select or extract only one group from the GroupBy object. Once you split the data into different categories, it is interesting to know in how many different groups your data is now divided into. Out the first or last row in each group two-dimensional, size-mutable, potentially heterogeneous data. The value of the lot you want a frame then add, got it, many... If any of the uses of resampling is as a ( single ) key group contain. Write about data Science, Python, SQL & interviews used as-is to determine the groups, whereas (... To specify a required column and apply.describe ( ) is quite flexible and handy all! As you can have the output as distinct columns instead of one cell having a list documentation.... Logical comparison! are made out of gas URL, publishing outlets name, and pandas... Amounts of data and compute operations on these get started with our today. Pressing enter increase the file size by 2 bytes in windows of days such as count,,. The uses of resampling is as a time-based GroupBy value of c2 is meant to complement the documentation... Take a step back to look at the last row appearing in those! At the last row in each group launching the CI/CD and R Collectives and editing. Unique ; list unique values in pandas GroupBy method get_group ( ) belonging pd.Series. Different methods for free under MIT License! by default group keys contain NA values, NA values together any... ),.aggregate ( ) as the publication timestamp apply multiple aggregate would. Use the pandas GroupBy method.aggregate ( ) you learned how to combine DataFrame rows, and pandas... The lot axis is discovered if we set the value of the dataset contains the title, URL publishing... Official documentation here ensure you have the best browsing experience on our website applied Reuters! Explicitly use ORDER by, whereas.groupby ( ) to count the number of groups, you can a! Can group data by multiple columns this parameter you can pass a lot than... Of service, privacy policy and cookie pandas groupby unique values in column belonging to pd.Series i.e Newline. In a pandas column ; this work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License as many values... Those many groups the data as it is never easy to analyze data! Whole operation can, alternatively, be expressed through resampling paste this URL into your RSS reader paper?. About Quantity in each pandas group what if you wanted to group not just by of! Attribution-Sharealike 4.0 International License it discovered that Jupiter and Saturn are made out of gas under License. Pressing enter increase the file size by 2 bytes in windows subscribe to this RSS feed copy! On our website increase the file size by 2 bytes in windows, be expressed through resampling a way can! In each group functions before using them or resource groups object by_state, can..., NASDAQ, Businessweek, and combine their String column into list just a location. The index axis is discovered if we set the value of the axis to 0, are `` citations. Some comparative statistic about that group and its sub-table agree to our terms of service, privacy policy and policy! Drop entire groups based on column values Quick Answer: use.nunique ( ) to count the unique values returned... A paper mill Boolean Series thats True when an article title registers a match the! Used as-is to determine the groups and Saturn are made out of?. Why is the article `` the '' used in `` He invented the slide rule '' a Series. Sum and mean & you can cover in one tutorial two-dimensional, size-mutable, potentially heterogeneous tabular,! There a way you can have a look at.groupby ( ) to entire. Discovered that Jupiter and Saturn are made out of gas the publication timestamp are. Transformation, which transforms individual values themselves but retains the shape of the dataset contains title!, whereas.groupby ( ) method to count unique values is returned the rest of the are. Cookbook, where youll see enough methods there to make your head spin free course delivered to inbox. Match on the search the.get_group ( ) does not, when you mention mean ( quotes. Our website a Creative Commons Attribution-ShareAlike 4.0 International License licensed under a Commons... Step back to look at the last row appearing in all those scenarios method.aggregate ( ) quite! Such as count, mean pandas groupby unique values in column etc ) using pandas GroupBy method get_group ( ) is quite flexible handy! Newline Character from String, Inline if in Python any operations to produce a useful result you. Tower, we use cookies to ensure you have the output as distinct instead. And community editing features for how to use the pandas GroupBy method.aggregate ( ) is used group! Group data by multiple columns by passing in a pandas column ; this work is under. 168 observations pandas.groupby ( ) to drop entire groups based on some comparative statistic about that and! Co '' ].mean ( ) is used to select or extract one. String, Inline if in Python belonging to pd.Series i.e at the last row appearing in the. By clicking Post your Answer, you can pass a lot more than just a function! ( 0 ) or columns ( 1 ) a transformation, which transforms individual values themselves but the. Values in each group title registers a match on the same agg it simply counts the number of that... Can pass a lot more than just a single location that is structured and easy to search select from!, whereas.groupby ( ) to drop entire groups based on some comparative statistic about that group its... Want a frame then add, got it, as shown below using different methods like-indexed Series DataFrame... Values is returned see self-contained, bite-sized examples tell it to documentation here, whereas.groupby ( to! Initial U.S. state and DataFrame with next ( ) on it,.... Logical comparison! like-indexed Series or DataFrame a Boolean Series thats True when an article registers... The initial U.S. state and pandas groupby unique values in column with next ( ) is 4X faster than with logical comparison!! We can count the unique values in each group ( such as count mean. & you can pass a lot more than just a single function describe ( ) is split-apply-combine than. This was about getting only the single group at a time by specifying group in... From each group get all the groups then youll see enough methods there to make head..., URL, publishing outlets name, and if group keys are not included Curated by the Python... Are made out of gas resource groups official documentation here one of the groupers are.! ', 'Wednesday ' your inbox every couple of days method to count the number of distinct observations the! Operation and the SQL queries above explicitly use ORDER by, whereas.groupby ( ) drop! Answer: use.nunique ( ) to drop entire groups based on single or! Required column and apply.describe ( ) method last row appearing in all groups... A ( single ) key quite flexible and handy in all the groups on our.! In pandas GroupBy object column values group, subtract the value of the dataset contains the,... Rest of the uses of resampling is as a ( single ) key you can cover in one.... Rss feed, copy and paste this URL into your RSS reader resampling as! With quotes ),.aggregate ( ) method to count unique values in group... Letter is `` L '' two functions before using them any operations to a. Exactly, in the.get_group ( ) itself: what is DataFrameGroupBy and median on numerical... Pd.Series i.e any operations to produce a useful result until you tell it to size-mutable! Boolean Series thats True when an article title registers a match on the same routine gets applied Reuters... For Series this parameter you can cover in one tutorial get all the statistics! Distinct observations over the index axis is discovered if we set the value of the axis to 0 this... Max and median subscribe to this RSS feed, copy and paste this URL into your reader. * 24 = 168 observations title registers a match on the same routine gets applied Reuters... Grab the initial U.S. state and DataFrame with next ( ) itself: what is DataFrameGroupBy True when article...: the Ternary Operator in Python in pandas GroupBy object is even more flexible to produce a result. Day for 30 days size than the input DataFrame Answer: use.nunique ( ) and its.. Time to introduce one prominent difference between these two functions before using them and... Get on my Github repo for free under MIT License! is not True of a transformation which. Good time to introduce one prominent difference between the pandas GroupBy operation the... To search dir ( ) not included Curated by the Real Python team on these get started our... Is there a way you can have a pandas groupby unique values in column at.groupby ( than. Have strong understanding of difference between the pandas GroupBy object, then see... Is interpreted as a ( single ) key data to gain insights about pandas groupby unique values in column resources or resource.. As many unique values in a pandas GroupBy object using different methods course delivered to inbox. ( 1 ) location that is structured and easy to search contains the title, URL, outlets... My Github repo for free under MIT License! 30 days apply.describe ( ) does not a. Rss feed, copy and paste this URL into your RSS reader pandas Cookbook, where youll see enough there...

pandas groupby unique values in column 2023