Gender-name tool to change the game for patent diversity analysis

“This should be easy” are words I regret saying to Suzanne Harrison, USIPA diversity & inclusion co-chair. This was back in 2019, at a meeting among 20 companies discussing the start of efforts that would ultimately lead to the Patent Diversity Pledge. Five years later, I still see companies struggling to get full cooperation from their HR departments to obtain gender data of their inventor pools. This is why we are excited to be releasing a new tool – GnE (pronounced ‘genie’) – to help companies estimate the gender of their inventors.

In the last five years, we have learned a lot about generating shareable data on inventorship and gender, as well as on other underrepresented inventor groups. However, HR departments are generally hesitant to share such data. Even if they give you internal access, they may not be comfortable with external sharing.

To accelerate diversity analysis, GnE takes a different approach: it estimates the gender of a person from their first name. While not perfect, it is shockingly good. It allows companies to begin implementing best practices and changes, and to see how their efforts pay off. GnE also helps companies make a stronger argument to HR for obtaining their real data. When we tested algorithmic approaches, we found that they were capable of delivering extremely high accuracy (see “Measuring diversity in invention and patenting is easier said than done” on the IAM platform).

Our team’s first effort to reduce the data access hurdle was an Excel-only tool. Today, we are pleased to unveil our newest solution: GnE. GnE is a free and open-source macOS application that analyses your patent data to estimate the gender of your inventor pool and is available here. We plan to post the source code to GitHub shortly with an open-source CC-BY-SA 4.0 licence to match that of WIPO’s World Gender Name Dictionary 2.0.

GnE is a major improvement on the previous tool in two key ways:

  • it uses the larger WIPO World Gender Name Dictionary 2.0, giving companies much more comprehensive coverage of international names; and
  • as a macOS-native application, it eliminates the need for users to manually manipulate spreadsheets.

GnE runs entirely locally: no data is sent to the internet or downloaded, so you can safely use it to analyse gender in internal confidential data, such as invention disclosure submissions or unpublished applications.

GnE expects an input file (Excel or CSV) with columns containing the following data:

  • first name (required – eg, “Erik” not “Erik Oliver”);
  • country (required – two-letter WIPO country code, eg, “US”);
  • disclosure or patent ID (recommended – this would be your identifier for a patent or disclosure); and
  • person ID (recommended – any unique identifier for a person, eg, email address or employee ID).
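To illustrate why both the first name and the country are required, here is a minimal sketch of a name lookup keyed on the pair. It is not GnE's implementation: the `TOY_DICTIONARY` entries, the `estimate_gender` function and the `"?"` marker for unmatched names are all hypothetical, and the toy entries are illustrative rather than taken from WIPO's dictionary.

```python
# Toy stand-in for a gender-name dictionary (hypothetical, illustrative entries only).
# Keys are (lower-cased first name, upper-cased two-letter country code).
TOY_DICTIONARY = {
    ("erik", "US"): "M",
    ("maria", "US"): "F",
}


def estimate_gender(first_name, country, dictionary=TOY_DICTIONARY):
    """Return "F", "M" or "?" (no match) for a first name and country code."""
    key = (first_name.strip().lower(), country.strip().upper())
    return dictionary.get(key, "?")


known = estimate_gender(" Erik ", "us")      # whitespace and case are normalised
unknown = estimate_gender("Zzyzx", "US")     # not in the toy dictionary
```

Normalising case and whitespace before the lookup is a design choice in this sketch; a real tool would also need a policy for names it cannot match.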

Here is some sample data that might be input into the tool:

Inventor First   Disclosure Number   Country   Business Unit   Email
Abdi             2020-HQ-12          US        BU1             [email protected]
Adrian           2020-HQ-22          US        BU1             [email protected]
Alene            2020-HQ-37          US        BU2             [email protected]
Alistair         2020-HQ-12          US        BU3             [email protected]
Allyn            2020-HQ-13          US        BU4             [email protected]
Angelo           2020-HQ-17          US        BU3             [email protected]

GnE allows the user to indicate which columns serve which purpose, and extra columns are not a problem. It then creates a new Excel file with a summary including four sets of statistics:

  • basic counts;
  • women inventor rate estimate;
  • fractional inventorship rate estimate; and
  • patent output estimate.

[Figure: visual summary of the different outputs]

The basic counts provide some context about the input file (ie, number of distinct patents/disclosures in the input, number of unique inventors and total number of inventors).
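As a rough sketch of how such counts could be derived, assume each input row pairs a disclosure ID with a person ID. The row format and the `basic_counts` function are hypothetical, and reading "total number of inventors" as the number of rows (one per inventor listing) is an assumption rather than GnE's documented behaviour.

```python
# Hypothetical rows: (disclosure_id, person_id). Not GnE's internal format.
rows = [
    ("D1", "p1"),
    ("D1", "p2"),
    ("D2", "p1"),  # p1 appears on a second disclosure
    ("D3", "p3"),
]


def basic_counts(rows):
    """Count distinct disclosures, unique people and total inventor listings."""
    disclosures = {disclosure for disclosure, _ in rows}
    people = {person for _, person in rows}
    return {
        "distinct_disclosures": len(disclosures),
        "unique_inventors": len(people),
        "total_inventors": len(rows),  # assumption: one row per inventor listing
    }


counts = basic_counts(rows)
```

This is also why a person ID is recommended: without it, the same inventor appearing on several disclosures cannot be reliably counted once.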

The women inventor rate estimate measures the share of women among all inventors in the data set. So, if you have three unique inventors in the pool and the tool estimates that one is a woman, then the women inventor rate would be 33%.

This measure is a good starting point for understanding the composition of your inventor pool, and it is the data point most directly comparable to the percentage of staff companywide who are women.
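The three-inventor example can be worked through in a few lines. This is only a sketch: the row format and the `women_inventor_rate` function are hypothetical, and it assumes every inventor already has an estimate of "F" or "M".

```python
# Hypothetical rows: (disclosure_id, person_id, estimated_gender).
rows = [
    ("D1", "p1", "F"),
    ("D1", "p2", "M"),
    ("D2", "p2", "M"),  # p2 is counted once, however many disclosures they are on
    ("D3", "p3", "M"),
]


def women_inventor_rate(rows):
    """Share of unique inventors whose estimated gender is "F"."""
    gender_by_person = {person: gender for _, person, gender in rows}
    women = sum(1 for gender in gender_by_person.values() if gender == "F")
    return women / len(gender_by_person)


rate = women_inventor_rate(rows)  # one woman among three unique inventors
```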

The fractional inventorship rate estimate measures the share of inventorship attributable to women across the disclosures or patents analysed. For example, if four disclosures were being analysed, each disclosure would be worth one point, split equally among its inventors. A disclosure estimated to have one woman and one man would count as 0.50 women and 0.50 men. Adding those fractions up across all disclosures gives the fractional inventorship rate (eg, 1.0 women versus 4.0 total). This measure is quite powerful for helping understand not only the proportion of women inventors, but also their level of participation.
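The arithmetic can be sketched as follows, using four hypothetical disclosures chosen to reproduce the 1.0 versus 4.0 example. The row format and the `fractional_inventorship` function are assumptions for illustration, as is the premise that every inventor has an estimate of "F" or "M".

```python
from collections import defaultdict

# Hypothetical rows: (disclosure_id, estimated_gender), one row per inventor listing.
rows = [
    ("D1", "F"), ("D1", "M"),  # 0.5 women, 0.5 men
    ("D2", "F"), ("D2", "M"),  # 0.5 women, 0.5 men
    ("D3", "M"),               # 0.0 women, 1.0 men
    ("D4", "M"), ("D4", "M"),  # 0.0 women, 1.0 men
]


def fractional_inventorship(rows):
    """Return (women points, total points); each disclosure is worth one point."""
    genders = defaultdict(list)
    for disclosure, gender in rows:
        genders[disclosure].append(gender)
    women_points = sum(g.count("F") / len(g) for g in genders.values())
    return women_points, float(len(genders))


women_points, total_points = fractional_inventorship(rows)
fractional_rate = women_points / total_points
```

Here two of the seven inventor listings are women, but the fractional rate is 25% because each disclosure carries equal weight regardless of team size.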

The patent output estimate provides some additional output measures, such as number of items with at least one woman, at least one man, solo women inventors and solo men inventors. These give some additional perspectives on participation that may be helpful for understanding how your programme is operating.
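A sketch of these output measures, under the same hypothetical row format, might look like this. The `output_measures` function is illustrative only, and treating "solo" as an item with exactly one listed inventor is an assumption.

```python
from collections import defaultdict

# Hypothetical rows: (disclosure_id, estimated_gender), one row per inventor listing.
rows = [
    ("D1", "F"), ("D1", "M"),
    ("D2", "F"),
    ("D3", "M"),
    ("D4", "M"), ("D4", "M"),
]


def output_measures(rows):
    """Count items by the estimated gender make-up of their inventor teams."""
    genders = defaultdict(list)
    for disclosure, gender in rows:
        genders[disclosure].append(gender)
    teams = list(genders.values())
    return {
        "at_least_one_woman": sum(1 for team in teams if "F" in team),
        "at_least_one_man": sum(1 for team in teams if "M" in team),
        "solo_woman": sum(1 for team in teams if team == ["F"]),
        "solo_man": sum(1 for team in teams if team == ["M"]),
    }


measures = output_measures(rows)
```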

No single one of these measures is right or wrong. Each provides different information about your inventor pool and the estimate of participation by women. All can provide a useful starting point for benchmarking and measuring the impact of initiatives.

Cautions and limitations

The current tool reflects the inherent limitations of using first names to identify gender. Such limits might render some LGBTQ+ individuals invisible and/or misgender some individuals. Despite those limitations, the tool is a valuable step on the path to greater diversity because it lets companies start the analysis and change process needed to better include women inventors.

Inclusivity Insights is a regular feature in which companies share stories, learnings, and experiences of their D&I journey related to IP and innovation with the IAM audience. Previous articles in the series:

Neurodiversity and mental health: Celebrating difference in the IP profession

Finding ‘lost Einsteins’: US patent advisory committee calls for more diverse inventors

Corning’s journey toward applying a diversity and inclusion lens to IP

Increasing diversity in innovation sprints

Diversity, equity & inclusion matter: a son’s perspective

IP and innovation inclusion takes a village: a Meta perspective

How the Pure patent programme is engineered for inclusive innovation

Diversity pledge companies now number more than 50

Closing diversity gaps in patenting: current initiatives and the HP perspective

How Seagate is working to advance diversity and inclusion in patenting

Betting on diversity in innovation 
