News from the EDGE: Week of January 4, 2021

2021 has finally arrived, and not a moment too soon! But does it really feel any different so far from 2020? Maybe not, but what is different is having not just one, but TWO industry experts to cover this week’s digital marketing news. Host Erin Sparks and Studio Creative Director Jacob Mann, along with guests Mordy Oberstein, Wix’s SEO Liaison, and George Nguyen, Editor at Third Door Media, discussed some of the latest headlines in this first-of-the-year news roundup from the award-winning EDGE of the Web podcast:

00:03:44

TikTok Faces New Legal Challenge Over its Tracking of Underage User Data

From Andrew Hutchinson on Social Media Today we learn that TikTok Faces New Legal Challenge Over its Tracking of Underage User Data. Despite facing an array of challenges that, at times, threatened the app’s very existence, TikTok continued to grow in 2020 and looks set to become an even more important and influential platform in the year ahead. But while TikTok has seemingly avoided a ban in the US (for now), the app remains under scrutiny due to concerns about its impact on young users and, given its Chinese ownership, about international data security.

  • Erin Sparks: This legal challenge is happening in the UK on behalf of an unidentified twelve-year-old girl from London. The action names six different companies responsible for TikTok and its predecessor and includes damage claims based on loss of control of personal data and use of that data in violation of UK and EU data protection laws. The EU and UK are far ahead of the US on data privacy, and we need to catch up. And this is a continuation of pushback from digital consumers. It’s one thing when adults waive their rights to be on these various platforms, but when it comes to children, we have to protect them.
  • Mordy Oberstein: The problem here is that it’s just too easy to give away your information; there are so few barriers. Like these sites that ask you if you’re 18 or older: you just say yes and you’re in!
  • George Nguyen: Yes, everything is touchier when you’re talking about children. And now children have the chance to make these mistakes online at a much earlier age than many of us adults did when we were growing up. How children interact with these platforms is something that really needs to change. 
  • Erin Sparks: The age gate question simply isn’t good enough when it comes to kids. There has to be more responsibility on the part of big tech to verify age that’s not just a click-through kind of scenario. And I’m saying it’s the platforms that have to take responsibility for this. 
  • Mordy Oberstein: The simple button click is just for the platforms to cover themselves legally. It doesn’t indicate they actually care about children’s privacy, because if they did they wouldn’t make it so easy for them to get in. 
  • George Nguyen: In some countries, like South Korea, your registration on these platforms is tied to your citizen identification number (like a Social Security number) so they can do age verification. But in the US we’ve grown used to anonymity, or to not using our real identity on platforms if we don’t want to. So that’s a real trade-off between verification and anonymity.
  • Erin Sparks: That would not go over well here in the US at all. So many people are already fearful and paranoid. I can’t imagine ever seeing that happen.

00:10:13

John Mueller of Google Helping Webmasters on New Year’s 2021

Over on Search Engine Roundtable is a story by Barry Schwartz about John Mueller of Google Helping Webmasters on New Year’s 2021. John Mueller of Google was online helping webmasters in public forums on both New Year’s Eve 2020 and New Year’s Day 2021.

  • Erin Sparks: Some of John’s contributions were amusing, but I’m thinking to myself, what the heck is he doing online on New Year’s Eve and New Year’s Day responding to things? I did like Lily Ray’s comment about how she wishes SEO tools wouldn’t send you all these alerts about things that in the end are totally non-issues, to which John said sometimes a little busy work helps you clear your mind! 
  • Mordy Oberstein: I have to agree about the SEO alerts from platforms because so many of them are like SEO spam. Like, “Your links are going to result in a manual action.” Really? The platform doesn’t know that. The problem is that the platform has to speak to both experienced and inexperienced SEOs. If you’re a newbie those alerts might be great, but if you’re experienced then it’s just spam.
  • Erin Sparks: Yes, you should be able to filter some of the garbage alerts out somehow.
  • George Nguyen: It does seem like they could be streamlined or triaged differently: put all the low-level stuff, condensed or grouped in one place, so the expert users can just ignore it (the sketch below shows the idea).
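That kind of triage is straightforward to picture in code. Here is a minimal sketch in Python; the alert data, severity labels, and field names are hypothetical illustrations, not any real SEO platform’s schema:

```python
from collections import defaultdict

# Hypothetical alerts an SEO platform might emit. The fields and
# severity labels are invented for illustration.
alerts = [
    {"severity": "low", "message": "3 images missing alt text"},
    {"severity": "low", "message": "Meta description runs long on /blog"},
    {"severity": "critical", "message": "Sitemap returns a 404"},
    {"severity": "low", "message": "2 internal links still use http://"},
]

def triage(alerts):
    """Group alerts by severity so low-level noise can be condensed."""
    buckets = defaultdict(list)
    for alert in alerts:
        buckets[alert["severity"]].append(alert["message"])
    return buckets

buckets = triage(alerts)

# Surface critical issues individually; collapse the low-level pile
# into a single line an experienced SEO can safely skim past.
for message in buckets.get("critical", []):
    print(f"NEEDS ATTENTION: {message}")
print(f"{len(buckets.get('low', []))} low-level notices (condensed)")
```

The point is simply that one grouping pass lets a tool shout about the critical items while folding the routine ones into a single summary line.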

00:15:37

Biased Language Models Can Result from Internet Training Data

George Nguyen on Search Engine Land reports how Biased Language Models Can Result from Internet Training Data. In 2019, Google announced BERT, calling it the largest change to its search system in nearly five years, and now it powers almost every English-based query. However, language models like BERT are trained on large datasets, and there are potential risks associated with developing language models this way. AI researcher Timnit Gebru’s departure from Google is tied to these issues, as well as concerns over how biased language models may affect search for both marketers and users.

  • George Nguyen: BERT as a language model had to be trained, and you need a really huge dataset to train it, so what better place to find such a large dataset than the internet, right? But the internet is very biased when you think about the countries that had it first and the languages spoken in those countries. If you speak that language, then obviously you won’t notice the bias to begin with. And researchers are trying to take steps to remove the bias, but if the bias is trained into the language model, and then the language model is inserted into an algorithm, and then that algorithm gets used for things like auto-suggest or the ranking process for search, then the bias ends up being perpetuated (the toy sketch at the end of this discussion shows that mechanism).
  • Erin Sparks: And we’re not talking bias in terms of morals or philosophy; we’re just talking bias in the language, because if it’s trained on English, then it’s kind of ignoring the rest of the world that doesn’t speak English.
  • George Nguyen: Yes, and the native language you think in has a huge influence on how your mind maps things. In many Asian languages, there’s a less rigid distinction between present tense and future tense, so you’re not always thinking in present tense. In English, when you say “I will save,” it implies you’re not saving now in the present. But in many languages, the same phrase can mean both “I will save” and “I am saving.” If your language model doesn’t capture these nuances, then it’s not serving a lot of people. And if we can’t keep this bias in check, then we need to be able to label something as being biased, such as in search results.
  • Mordy Oberstein: There was a study some years ago about image datasets that were put together by white Caucasians, and guess what, the datasets were skewed toward white Caucasians. They didn’t do it on purpose or with nefarious intentions; it just happened because that’s their relationship to the world. So I think it’s important to keep in mind that it’s very easy for biases to sneak in without you even being aware it’s happening. Some years back, Google actually tried to make its image data more inclusive after noticing how bad it was at recognizing wedding photos from cultures in which weddings don’t conform to the typical Judeo-Christian concept of a wedding. It’s a problem, and Google knows it’s a problem, whether they’re willing to admit it or not.
  • Erin Sparks: Like the study of facial recognition software that found less than a 1% error rate for light-skinned men but up to a 35% error rate for dark-skinned women!
  • George Nguyen: This potential for bias in training language models on large datasets is what the story is about, but it’s been crowded out by the controversy around the person who wrote the paper about that bias. She is a leading AI researcher and a Black woman, and Google let her go. Was it a voluntary resignation? It’s hard to tell; it feels like she was forced out of the company. And it seems like a lot of people within Google who have been speaking up and speaking out about diversity and harassment there are being forced out of the company.
  • Mordy Oberstein: And it’s weird how Google can be so good at so much of what it does and yet be so terrible in this particular area. The optics for them really are not good at all. It mystifies me why Google wouldn’t just own the fact that there is bias in their datasets. That should be the starting point, not trying to deny or refute that the bias is there. Own it and commit to working to solve it. You know, Bing came right out and said, yes, of course there’s bias in language models and we need to work on that. But somehow with Google there simply wasn’t any of that open dialogue or transparency; it feels like it’s all about trying to sweep it under the rug or something. Not a good look.
  • George Nguyen: And the problem has very practical and real consequences in search, because Google can’t ask you to clarify what you’re trying to search for. So there was controversy around someone who searched for “Jewish baby strollers” and what came up on the image search were a lot of horrifyingly offensive anti-Semitic memes. You can’t blame Google for wanting to shy away from responsibility around things like that, and yet as the biggest player everyone relies on for search, they kind of have to deal with it, right? Because the impacts on people are very real.
  • Mordy Oberstein: Google really fumbled its response. Now, there’s no way Google could have predicted that searching on “Jewish baby strollers” would bring up a meme with a grill on it. But could that content have carried a label like “WARNING: offensive content”? They couldn’t have known it was coming, but now that they DO know, why wouldn’t they label that content?
  • George Nguyen: Facebook does a lot of content moderation using humans, and the stories are out there about the PTSD those moderators experience. Google is dealing with far more content, so there’s no way they’re going to do it with humans; it would just be a really bad idea. Users can give feedback on a SERP, but it’s at the very bottom of the page in like six-point font, and most people never end up going “below the fold,” right? So Google’s desire to get feedback from users is essentially nonexistent.
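To make the mechanics the panel is describing concrete, here is a deliberately tiny sketch of how a skew in training data becomes a skew in a model’s output. Everything in it (the four-sentence corpus, the function, the frequency-counting “model”) is invented for illustration and bears no resemblance to how BERT or Google’s actual systems are built:

```python
from collections import Counter

# A deliberately skewed toy "training corpus". Whatever is
# over-represented here will be over-represented downstream.
corpus = [
    "the engineer fixed his code",
    "the engineer debugged his build",
    "the engineer reviewed his design",
    "the engineer shipped her feature",
]

def pronoun_association(word, corpus):
    """Count which pronouns co-occur with a word. A frequency-based
    model ranks associations exactly as skewed as its data."""
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        if word in tokens:
            for token in tokens:
                if token in {"his", "her"}:
                    counts[token] += 1
    return counts.most_common()

# Prints [('his', 3), ('her', 1)]: the 3-to-1 skew in the data
# becomes a 3-to-1 skew in the "model". Nobody coded the bias in
# on purpose; it rode in with the training data.
print(pronoun_association("engineer", corpus))
```

Scale that up to internet-sized data and plug the model into auto-suggest or ranking, and the skew ships with it, which is exactly the point about biases sneaking in unnoticed.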

Connect with George Nguyen, Editor at Third Door Media

Twitter: @geochingu (https://twitter.com/geochingu)

LinkedIn: https://www.linkedin.com/in/george-c-nguyen

 

Connect with Mordy Oberstein, Wix’s SEO Liaison

Twitter: @MordyOberstein (https://twitter.com/MordyOberstein)

LinkedIn: https://www.linkedin.com/in/mordy-oberstein-12551715/ 

 

Connect with Erin Sparks, Host of EDGE of the Web and Owner of Site Strategics

Twitter: @ErinSparks (https://twitter.com/erinsparks)

LinkedIn: https://www.linkedin.com/in/erinsparks/